Machine Learning Architect: For all your AI and GenAI needs

  • TensorFlow versus Vertex AI on GCP

    On Google Cloud Platform (GCP), TensorFlow and Vertex AI serve different roles in the machine learning lifecycle, each with its own features and purpose. TensorFlow: (1) Framework: TensorFlow is an open-source machine learning framework developed by Google. It is used to build, train, and deploy machine learning models. (2) Capabilities: Provides tools…

  • Principal Component Analysis (PCA) versus Feature Crossing

    Feature crossing and Principal Component Analysis (PCA) are both techniques used in machine learning to manipulate features, but they serve different purposes and operate differently. Feature Crossing: Feature crossing involves creating new features by combining existing ones. This technique is particularly useful in linear models, where interactions between features can help the model capture more…
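
As a rough illustration of combining existing features into a new one, here is a minimal Python sketch. The column names (`day`, `hour_bucket`), the `_x_` separator, and the helper `cross_features` are hypothetical and chosen only for this example; they do not come from the post.

```python
def cross_features(row, names):
    """Return one crossed feature built from the named categorical columns."""
    # Joining the values gives a linear model one weight per combination,
    # rather than one weight per individual value.
    return "_x_".join(str(row[name]) for name in names)


# Hypothetical rows with two categorical features.
rows = [
    {"day": "sat", "hour_bucket": "evening"},
    {"day": "mon", "hour_bucket": "morning"},
]

for row in rows:
    row["day_x_hour"] = cross_features(row, ["day", "hour_bucket"])
```

After the loop, each row carries an extra `day_x_hour` value that a linear model can weight separately from `day` and `hour_bucket` on their own.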

  • K-Means Clustering Use Cases

    K-Means Clustering. Purpose: K-means is an unsupervised learning algorithm that clusters data into a predefined number of clusters (k). How it works: (1) Initialization: select k initial cluster centroids randomly. (2) Assignment: assign each data point to the nearest cluster centroid. (3) Update: recalculate the centroid of each cluster from its assigned data points. (4) Iterate: Repeat…
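
The four steps can be sketched in plain Python for 2-D points. This is a minimal illustration, not a production implementation: the function name `kmeans`, the fixed seed, the iteration cap, and the stopping rule (stop when the centroids no longer move) are assumptions, since the excerpt is cut off before it states when to stop.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster 2-D points (tuples) into k groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Initialization: pick k distinct data points as starting centroids.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)  # placeholder, overwritten on the first pass
    for _ in range(iterations):
        # Assignment: each point goes to its nearest centroid (squared Euclidean distance).
        assignments = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(centroids[c])  # an empty cluster keeps its centroid
        # Iterate: assumed stopping rule, stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assignments
```

For example, `kmeans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], k=2)` separates the three points near the origin from the three points near (10, 10).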

  • Q learning for optimizing the distribution of Nodes in a network

    Optimizing Distribution of Nodes in a Wireless Network Using Q-Learning: Q-learning, a type of reinforcement learning, can be used to optimize the distribution of nodes in a wireless network. The main goal is to improve network performance by finding optimal placement strategies for the nodes. Here’s a step-by-step approach to applying Q-learning for this purpose:…

  • TensorFlow’s Object Detection API

    TensorFlow’s Object Detection API is an open-source framework built on top of TensorFlow that allows users to easily build, train, and deploy object detection models. It provides a comprehensive set of tools for working with images and videos to detect objects, such as people, cars, animals, and more. Key Features: Pre-trained Models: Access to a…

  • Change Data Capture using Delta Live Tables (Databricks)

    Implementing Change Data Capture (CDC) techniques and managing Delta Live Tables (DLT) for real-time data integration and analytics involves a series of steps to ensure that changes in the source data are captured and reflected in the target data store in real time. Here’s a comprehensive guide on how to achieve this: Change Data Capture (CDC)…

  • K Nearest Neighbors versus Q Learning Algorithms

    Q-Learning and k-Nearest Neighbors (k-NN) are two distinct algorithms used in different areas of machine learning. Here’s a comparison to highlight their differences. Q-Learning overview: Type: Reinforcement learning. Purpose: To find the optimal action-selection policy in a given environment by maximizing cumulative rewards. Learning method: Model-free learning from interactions with the environment. Core concept: Uses…

  • Connecting Looker Studio to On Premises Databases

    First, ensure that the database IP is reachable from the Looker IPs (there’s a set of known IPs that Looker uses). Once that is validated, it is a matter of: (1) creating a new connection in Looker Studio for the database; (2) allowing the Looker IPs to connect (in the target database). The list of known Looker…

  • Bitcoin Analytics

    Find Duplicate Transactions: A single transaction can belong to only one block. However, in earlier versions of Bitcoin (due to a different database), there was a transaction that was added to two blocks. This transaction can be found using this query:

        select * from (select transaction_id, count(transaction_id) as duplicates from `bigquery-public-data.bitcoin_blockchain.transactions` group by transaction_id) where duplicates > 1

    Find wallets with over 1,000 BTC. Find transactions that transferred over 1,000 BTC.

  • AutoML versus CloudML versus SparkML (DataProc)

    Overview – Training Sets: Training sets are split 70%/30%. The first 70% is used for training; the remaining 30% is used for tuning the model’s parameters. AutoML: Google’s AutoML lets you start training with as few as 10-12 items (e.g. Vision AutoML requires a dozen or so images to start training). Google provides the…
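
A 70/30 split like the one described can be sketched in a few lines of Python. The helper name `split_70_30` and the seeded shuffle before cutting are assumptions made for this illustration; the post itself only says that the first 70% goes to training and the rest to tuning.

```python
import random


def split_70_30(items, seed=0):
    """Return (training, tuning) lists holding 70% and 30% of the items."""
    shuffled = list(items)
    # Assumption: shuffle first so the split does not depend on input order.
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(shuffled) * 7 // 10
    return shuffled[:cut], shuffled[cut:]


training, tuning = split_70_30(range(10))
```

With ten items this yields seven for training and three for tuning, and every item lands in exactly one of the two lists.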
