A (30 entries)
Data Science
A/B Testing
A/B testing is an experimental method that compares two versions of a model, prompt, or interface to determine which performs better. In AI, A/B testing helps evaluate model outputs, UI changes, and prompt strategies by measuring user engagement or accuracy.
Natural Language Processing
Abstractive Summarization
Abstractive summarization generates new text that captures the key points of a longer document, rather than simply extracting existing sentences. It requires deep language understanding and generation capabilities.
Machine Learning
Accuracy
Accuracy is a metric that measures the proportion of correct predictions out of total predictions made by a model. While intuitive, accuracy can be misleading on imbalanced datasets where one class dominates.
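As an illustration of the imbalance problem, the sketch below computes accuracy for a degenerate classifier that always predicts the majority class. The dataset (95 negatives, 5 positives) and the `accuracy` helper are made up for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical imbalanced dataset: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A degenerate "model" that always predicts the majority class.
y_pred = [0] * 100

majority_accuracy = accuracy(y_true, y_pred)
positives_found = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
```

The score looks high even though the classifier never identifies a single positive example, which is why accuracy is usually read alongside precision and recall on skewed data.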
Deep Learning
Activation Function
An activation function is a mathematical function applied to a neuron's output to introduce non-linearity into a neural network. Common activation functions include ReLU, sigmoid, and tanh, each with different properties for gradient flow.
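The three functions named above can be written in a few lines. This is a scalar sketch using only the standard library; real frameworks apply the same formulas element-wise to tensors.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squashes any real input into the open interval (-1, 1)."""
    return math.tanh(x)


samples = [-2.0, 0.0, 2.0]
table = [(x, relu(x), sigmoid(x), tanh(x)) for x in samples]
```

Sigmoid and tanh flatten out for large positive or negative inputs, so their gradients shrink there, while ReLU keeps a constant slope for positive inputs.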
Machine Learning
Active Learning
Active learning is a machine learning approach where the model selectively queries an oracle (often a human) for labels on the most informative data points. This reduces the total amount of labeled data needed to train an accurate model.
Deep Learning
Adam Optimizer
Adam (Adaptive Moment Estimation) is an optimization algorithm that combines the benefits of AdaGrad and RMSProp. It adapts learning rates for each parameter using estimates of first and second moments of gradients.
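A minimal single-parameter sketch of the Adam update rule, assuming the commonly used default values for `beta1`, `beta2`, and `eps`. The function name `adam_minimize`, the learning rate, and the quadratic test objective are illustrative choices, not part of the entry.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a one-dimensional function given its gradient, using Adam."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # first moment: running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # second moment: running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)  # per-parameter scaled step
    return x


# Illustrative objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
```

Dividing by the square root of the second-moment estimate is what makes the step size adaptive: parameters with consistently large gradients take proportionally smaller steps.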
Deep Learning
Adapter Layers
Adapter layers are small trainable modules inserted into a pre-trained model to enable parameter-efficient fine-tuning. They allow task adaptation while keeping the original model weights frozen.
AI Ethics & Safety
Adversarial Attack
An adversarial attack is a technique that creates deliberately crafted inputs designed to fool a machine learning model into making incorrect predictions. These attacks reveal vulnerabilities in AI systems and are critical to AI safety research.
AI Ethics & Safety
Adversarial Training
Adversarial training is a defense strategy that improves model robustness by including adversarial examples in the training data. The model learns to correctly classify both normal and adversarially perturbed inputs.
AI Applications
Agent
An AI agent is an autonomous system that perceives its environment, makes decisions, and takes actions to achieve specific goals. Modern AI agents can use tools, browse the web, write code, and chain multiple reasoning steps together.
AI Applications
Agentic AI
Agentic AI refers to AI systems that can autonomously plan, reason, and execute multi-step tasks with minimal human oversight. These systems use tool calling, memory, and iterative problem-solving to accomplish complex goals.
Fundamentals
AGI
Artificial General Intelligence (AGI) refers to a hypothetical AI system with human-level cognitive abilities across all intellectual tasks. Unlike narrow AI, AGI would be able to learn, reason, and solve problems in any domain without task-specific training.
AI Ethics & Safety
AI Alignment
AI alignment is the research field focused on ensuring that AI systems pursue goals and behaviors consistent with human values and intentions. Alignment is considered one of the most important challenges in AI safety.
AI Infrastructure
AI Chip
An AI chip is a specialized processor designed specifically for artificial intelligence workloads like neural network training and inference. Examples include NVIDIA's GPUs, Google's TPUs, and custom ASICs.
AI Ethics & Safety
AI Ethics
AI ethics is the branch of ethics that examines the moral implications of developing and deploying artificial intelligence systems. It addresses fairness, transparency, privacy, accountability, and the societal impact of AI technology.
AI Ethics & Safety
AI Safety
AI safety is the interdisciplinary field focused on ensuring AI systems operate reliably, beneficially, and without causing unintended harm. It encompasses alignment, robustness, interpretability, and governance research.
AI Applications
AI Visibility
AI visibility refers to how prominently a brand, product, or entity appears in AI-generated responses from systems like ChatGPT, Perplexity, and Gemini. As AI-powered search grows, visibility in AI recommendations becomes a critical marketing metric.
Fundamentals
AI Winter
An AI winter is a period of reduced funding, interest, and research progress in artificial intelligence. Historical AI winters occurred in the 1970s and late 1980s, often following inflated expectations and undelivered promises.
Fundamentals
Algorithm
An algorithm is a step-by-step procedure or set of rules for solving a computational problem. In AI, algorithms define how models learn from data, make predictions, and optimize their performance.
Data Science
Annotation
Annotation is the process of adding labels or metadata to raw data to create training datasets for supervised learning. Data annotation can involve labeling images, tagging text, or marking audio segments.
Machine Learning
Anomaly Detection
Anomaly detection is the identification of data points, events, or patterns that deviate significantly from expected behavior. AI-based anomaly detection is used in fraud prevention, cybersecurity, and industrial monitoring.
AI Infrastructure
API
An API (Application Programming Interface) is a set of protocols and tools that allows different software systems to communicate. AI APIs enable developers to integrate machine learning capabilities like text generation, image recognition, and speech processing into applications.
Fundamentals
Artificial General Intelligence
Artificial General Intelligence is a theoretical form of AI that would match or exceed human cognitive abilities across all domains. AGI remains an aspirational goal rather than a current reality in AI research.
Fundamentals
Artificial Intelligence
Artificial Intelligence is the broad field of computer science focused on creating systems that can perform tasks requiring human-like intelligence. AI encompasses machine learning, natural language processing, computer vision, and robotics.
Fundamentals
Artificial Narrow Intelligence
Artificial Narrow Intelligence (ANI) refers to AI systems designed to perform specific tasks, such as image recognition or language translation. Under this definition, current AI systems, including large language models, are generally classified as narrow intelligence.
Fundamentals
Artificial Superintelligence
Artificial Superintelligence (ASI) is a hypothetical AI that would surpass human intelligence in every cognitive dimension. The prospect of ASI raises profound questions about control, alignment, and the future of humanity.
Deep Learning
Attention Mechanism
An attention mechanism allows neural networks to focus on the most relevant parts of the input when producing each element of the output. Attention is the foundational innovation behind the Transformer architecture and modern large language models.
Deep Learning
Autoencoder
An autoencoder is a neural network trained to compress input data into a compact representation and then reconstruct it. Autoencoders are used for dimensionality reduction, denoising, and learning latent representations.
Machine Learning
AutoML
Automated Machine Learning (AutoML) is the process of automating the end-to-end pipeline of applying machine learning, including feature engineering, model selection, and hyperparameter tuning. AutoML democratizes AI by reducing the expertise required.
Robotics & Automation
Autonomous Systems
Autonomous systems are AI-powered machines that can operate and make decisions independently without continuous human supervision. Examples include self-driving cars, delivery drones, and robotic warehouse systems.
B (16 entries)
Deep Learning
Backpropagation
Backpropagation is the algorithm used to train neural networks by computing gradients of the loss function with respect to each weight. It propagates error signals backward through the network to update weights and minimize prediction errors.
Machine Learning
Bagging
Bagging (Bootstrap Aggregating) is an ensemble technique that trains multiple models on bootstrap samples of the training data (random samples drawn with replacement) and combines their predictions by averaging or voting. Random Forest is the best-known bagging-based algorithm.
Deep Learning
Batch Normalization
Batch normalization is a technique that normalizes layer inputs across mini-batches during training to stabilize and accelerate neural network training. It was originally motivated as a way to reduce internal covariate shift, and in practice it allows higher learning rates.
Deep Learning
Batch Size
Batch size is the number of training examples used in one iteration of gradient descent. Larger batches provide more stable gradient estimates but require more memory, while smaller batches add gradient noise that can act as a mild regularizer.
Machine Learning
Bayesian Network
A Bayesian network is a probabilistic graphical model that represents variables and their conditional dependencies using a directed acyclic graph. It enables reasoning under uncertainty and causal inference.
Natural Language Processing
Beam Search
Beam search is a decoding algorithm that explores multiple candidate sequences simultaneously, keeping only the top-k most promising at each step. It balances between greedy decoding and exhaustive search in text generation.
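The top-k pruning step can be sketched as follows. The `beam_search` function, the `TABLE` of next-token probabilities, and the `toy_model` wrapper are all hypothetical; a real decoder would query a language model instead of a lookup table.

```python
import math


def beam_search(next_log_probs, start, beam_width, max_len):
    """Return the best (sequence, score) found while keeping `beam_width` candidates per step."""
    beams = [([start], 0.0)]  # each beam is (sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            options = next_log_probs(seq)
            if not options:  # no continuation: carry the finished sequence forward
                candidates.append((seq, score))
                continue
            for token, logp in options.items():
                candidates.append((seq + [token], score + logp))
        # Prune: keep only the top-k candidates by cumulative score.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0]


# Hypothetical toy model: next-token probabilities depend only on the last token.
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"x": 0.9, "y": 0.1},
}


def toy_model(seq):
    return {tok: math.log(p) for tok, p in TABLE.get(seq[-1], {}).items()}


greedy_seq, greedy_score = beam_search(toy_model, "<s>", beam_width=1, max_len=2)
beam_seq, beam_score = beam_search(toy_model, "<s>", beam_width=2, max_len=2)
```

With a beam width of 1 the search is greedy and commits to "a" because it is the likeliest first token. With a width of 2 it keeps "b" alive long enough to find the higher-probability sequence that starts with it.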
Data Science
Benchmark
A benchmark is a standardized test or dataset used to evaluate and compare the performance of different AI models. Common benchmarks include MMLU, HumanEval, and ImageNet.
Natural Language Processing
BERT
BERT (Bidirectional Encoder Representations from Transformers) is a language model developed by Google that reads text in both directions simultaneously. BERT revolutionized NLP by enabling deep bidirectional pre-training for language understanding tasks.
AI Ethics & Safety
Bias in AI
Bias in AI refers to systematic errors or unfair outcomes in machine learning models that arise from biased training data, flawed assumptions, or problematic design choices. Addressing AI bias is essential for building fair and equitable systems.
Machine Learning
Bias-Variance Tradeoff
The bias-variance tradeoff is the fundamental tension in machine learning between model simplicity (high bias) and model flexibility (high variance). Optimal models balance underfitting and overfitting to generalize well to new data.
Natural Language Processing
Bigram
A bigram is a contiguous sequence of two items (typically words or characters) from a given text. Bigram models estimate the probability of a word based on the immediately preceding word.
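A small sketch of a word-level bigram model estimated by counting, with no smoothing. The toy sentence and the helper names `train_bigram` and `bigram_prob` are illustrative.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    pair_counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        pair_counts[prev][nxt] += 1
    return pair_counts


def bigram_prob(pair_counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if `prev` was never seen."""
    total = sum(pair_counts[prev].values())
    return pair_counts[prev][nxt] / total if total else 0.0


tokens = "the cat sat on the mat the cat ran".split()
model = train_bigram(tokens)
p_cat_given_the = bigram_prob(model, "the", "cat")  # "the" is followed by cat, mat, cat
```

Unsmoothed counts assign zero probability to any pair that never appeared in the training text, which is why practical n-gram models add smoothing.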
Machine Learning
Binary Classification
Binary classification is a supervised learning task where the model assigns inputs to one of exactly two categories. Spam detection (spam vs. not spam) and medical diagnosis (positive vs. negative) are common examples.
Deep Learning
Boltzmann Machine
A Boltzmann machine is a stochastic recurrent neural network that can learn a probability distribution over its inputs. Restricted Boltzmann Machines (RBMs) were influential in the deep learning revolution as building blocks for deep belief networks.
Machine Learning
Boosting
Boosting is an ensemble method that trains models sequentially, with each new model focusing on correcting the errors of previous ones. Popular boosting algorithms include AdaBoost, Gradient Boosting, and XGBoost.
Computer Vision
Bounding Box
A bounding box is a rectangular border drawn around an object in an image to indicate its location and extent. Bounding boxes are the primary output format for object detection models.
Natural Language Processing
Byte Pair Encoding
Byte Pair Encoding (BPE) is a subword tokenization algorithm that iteratively merges the most frequent pairs of characters or character sequences. BPE is widely used in modern language models to handle rare words and multilingual text.
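The iterative merge loop can be sketched on a tiny word-frequency table. This is a character-level illustration with hypothetical helper names and made-up frequencies; production tokenizers typically operate on bytes and add end-of-word handling that is omitted here.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the adjacent symbol pair with the highest total frequency, or None."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge rules, starting from single characters."""
    words = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges, words


# Hypothetical word frequencies.
merges, vocab = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, num_merges=3)
```

After a few merges the frequent suffix "est" becomes a single symbol, so a rare word that ends in it can still be represented with a handful of known subword units.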
C (20 entries)
Deep Learning
Catastrophic Forgetting
Catastrophic forgetting is the tendency of neural networks to abruptly lose previously learned knowledge when trained on new tasks. Continual learning research aims to overcome this limitation.
Data Science
Causal Inference
Causal inference is the process of determining cause-and-effect relationships from data, going beyond mere correlation. AI systems increasingly use causal reasoning to make more robust and interpretable decisions.
Generative AI
Chain of Thought
Chain of thought is a prompting technique that encourages large language models to break down complex reasoning into intermediate steps. This approach significantly improves performance on math, logic, and multi-step reasoning tasks.
AI Applications
Chatbot
A chatbot is a software application that simulates human conversation through text or voice interactions. Modern AI chatbots use large language models to generate contextually relevant, natural-sounding responses.
Generative AI
ChatGPT
ChatGPT is an AI chatbot developed by OpenAI that uses large language models to generate human-like conversational responses. It became one of the fastest-growing consumer applications in history after its launch in November 2022.
Machine Learning
Classification
Classification is a supervised learning task where the model predicts which category or class an input belongs to. Examples include email spam detection, image recognition, and sentiment analysis.
Generative AI
Claude
Claude is an AI assistant developed by Anthropic, designed to be helpful, harmless, and honest. It is built using Constitutional AI techniques and competes with models like GPT-4 and Gemini.
Machine Learning
Clustering
Clustering is an unsupervised learning technique that groups similar data points together without predefined labels. Common clustering algorithms include K-Means, DBSCAN, and hierarchical clustering.
Deep Learning
CNN
A CNN (Convolutional Neural Network) is a deep learning architecture designed to process grid-structured data like images. CNNs use convolutional filters to automatically learn spatial hierarchies of features.
Computer Vision
Computer Vision
Computer vision is a field of AI that enables machines to interpret and understand visual information from images and videos. Applications include facial recognition, autonomous driving, medical imaging, and augmented reality.
Machine Learning
Confusion Matrix
A confusion matrix is a table that summarizes a classification model's performance by showing true positives, true negatives, false positives, and false negatives. It provides a detailed breakdown beyond simple accuracy.
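For the binary case, the four cells can be tallied directly from paired labels. The label lists and the `confusion_counts` helper below are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


# Hypothetical labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
```

Accuracy collapses these four numbers into one; keeping them separate shows whether the errors are mostly false positives or false negatives.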
AI Ethics & Safety
Constitutional AI
Constitutional AI is an approach developed by Anthropic that trains AI systems to be helpful, harmless, and honest using a set of written principles. The model critiques and revises its own outputs based on these constitutional rules.
Machine Learning
Continual Learning
Continual learning is the ability of an AI system to learn new tasks or knowledge over time without forgetting previously learned information. It aims to create more human-like learning that accumulates knowledge incrementally.
Deep Learning
Contrastive Learning
Contrastive learning is a self-supervised technique that trains models to distinguish between similar and dissimilar data pairs. It learns useful representations by pulling similar examples closer and pushing dissimilar ones apart in embedding space.
Deep Learning
Convolutional Neural Network
A convolutional neural network is a specialized deep learning architecture that applies learned filters across input data to detect patterns. CNNs excel at image recognition, object detection, and visual understanding tasks.
Natural Language Processing
Corpus
A corpus is a large, structured collection of text documents used for training and evaluating natural language processing models. The quality and diversity of a training corpus significantly impacts model performance.
Machine Learning
Cross-Entropy
Cross-entropy is a loss function that measures the difference between two probability distributions, typically the model's predicted distribution and the true labels. It is the standard loss function for classification tasks in deep learning.
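A minimal sketch of the formula H(p, q) = -sum(p_i * log q_i) for a single example with a one-hot true label. The `eps` floor is an assumption added here to avoid taking the logarithm of zero; the probability vectors are illustrative.

```python
import math


def cross_entropy(true_dist, pred_dist, eps=1e-12):
    """Cross-entropy between a true distribution and a predicted distribution."""
    # Clamp predictions at eps so log(0) cannot occur.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(true_dist, pred_dist))


one_hot = [0.0, 1.0, 0.0]  # the true class is index 1
confident = cross_entropy(one_hot, [0.05, 0.90, 0.05])
unsure = cross_entropy(one_hot, [0.30, 0.40, 0.30])
```

With a one-hot label the loss reduces to the negative log-probability assigned to the correct class, so the confident correct prediction is penalized far less than the hesitant one.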
Data Science
Cross-Validation
Cross-validation is a model evaluation technique that splits data into multiple folds, training and testing on different subsets in rotation. K-fold cross-validation provides more reliable performance estimates than a single train-test split.
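The fold rotation can be sketched as an index generator. This version does not shuffle, which real pipelines usually do first; the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
```

Every sample lands in the test set exactly once, so averaging the k test scores uses all of the data for evaluation without ever testing on a sample the model was trained on in that round.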
AI Infrastructure
CUDA
CUDA (Compute Unified Device Architecture) is NVIDIA's parallel computing platform that allows developers to use GPUs for general-purpose processing. CUDA is the foundation of GPU-accelerated deep learning training.
Machine Learning
Curriculum Learning
Curriculum learning is a training strategy that presents examples to a model in a meaningful order, starting with easier examples and progressively introducing harder ones. This mimics human learning and can improve convergence.
D (19 entries)
Data Science
Data Augmentation
Data augmentation is a technique that artificially increases training dataset size by creating modified versions of existing data. In computer vision, this includes rotations, flips, and color changes; in NLP, it includes paraphrasing and synonym replacement.
Data Science
Data Drift
Data drift occurs when the statistical properties of production data change over time compared to the training data. Drift can degrade model performance and requires monitoring and retraining strategies to address.
Data Science
Data Labeling
Data labeling is the process of assigning meaningful tags or annotations to raw data to create supervised learning datasets. High-quality labeled data is essential for training accurate machine learning models.
AI Infrastructure
Data Lake
A data lake is a centralized storage repository that holds vast amounts of raw data in its native format. AI systems often draw training data from data lakes that store structured, semi-structured, and unstructured information.
AI Infrastructure
Data Pipeline
A data pipeline is an automated series of data processing steps that moves and transforms data from source systems to a destination. ML data pipelines handle ingestion, cleaning, feature engineering, and model training workflows.
AI Infrastructure
Data Warehouse
A data warehouse is a centralized repository for structured, processed data optimized for analysis and reporting. AI and ML systems often source their training data from enterprise data warehouses.
Machine Learning
Decision Boundary
A decision boundary is the surface or line that separates different classes in a classification model's feature space. The shape and complexity of decision boundaries depend on the model architecture and training data.
Machine Learning
Decision Tree
A decision tree is a supervised learning algorithm that makes predictions by learning a series of if-then rules from training data. Decision trees are interpretable and form the basis of powerful ensemble methods like Random Forest.
Deep Learning
Deep Learning
Deep learning is a subset of machine learning that uses artificial neural networks with multiple layers to learn hierarchical representations of data. Deep learning has achieved breakthrough results in vision, language, and speech.
Reinforcement Learning
Deep Reinforcement Learning
Deep reinforcement learning combines deep neural networks with reinforcement learning algorithms to handle complex, high-dimensional environments. It has achieved superhuman performance in Go, chess, and a range of Atari video games.
AI Ethics & Safety
Deepfake
A deepfake is AI-generated synthetic media that convincingly replaces a person's likeness, voice, or actions in images, audio, or video. Deepfakes raise significant concerns about misinformation and identity fraud.
Deep Learning
Depthwise Separable Convolution
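Depthwise separable convolution is an efficient convolution variant that factorizes a standard convolution into a per-channel depthwise operation followed by a 1x1 pointwise operation. It substantially reduces computation and parameter count with little loss in accuracy, which is why it is common in mobile-oriented architectures such as MobileNet.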
Generative AI
Diffusion Model
A diffusion model is a generative AI model that creates data by learning to reverse a gradual noise-adding process. Diffusion models power state-of-the-art image generation systems like Stable Diffusion and DALL-E.
Data Science
Dimensionality Reduction
Dimensionality reduction is the process of reducing the number of features in a dataset while preserving its essential structure. Techniques like PCA and t-SNE help with visualization, noise reduction, and computational efficiency.
Generative AI
Discriminator
A discriminator is the component of a GAN that learns to distinguish between real and generated data. It provides feedback to the generator, creating an adversarial training dynamic that improves output quality.
Deep Learning
Distillation
Knowledge distillation is a model compression technique where a smaller student model learns to replicate the behavior of a larger teacher model. Distillation makes it possible to deploy powerful AI in resource-constrained environments.
AI Infrastructure
Distributed Training
Distributed training is the practice of splitting model training across multiple GPUs or machines to handle large models and datasets. It uses data parallelism or model parallelism to accelerate training.
Deep Learning
Dropout
Dropout is a regularization technique that randomly deactivates a fraction of neurons during training to prevent overfitting. It forces the network to learn redundant representations and improves generalization.
Fundamentals
Dynamic Programming
Dynamic programming is an algorithmic technique that solves complex problems by breaking them into simpler overlapping subproblems. It is used in reinforcement learning, sequence alignment, and optimal control.
E (10 entries)
AI Infrastructure
Edge AI
Edge AI refers to running artificial intelligence algorithms locally on hardware devices rather than in the cloud. Edge AI enables real-time inference with lower latency, better privacy, and reduced bandwidth requirements.
Deep Learning
Embedding
An embedding is a dense vector representation that captures the semantic meaning of data like words, sentences, or images in a continuous mathematical space. Similar items are mapped to nearby points, enabling semantic search and comparison.
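"Nearby" is often measured with cosine similarity. The sketch below uses hand-written three-dimensional vectors purely for illustration; real embeddings come from a trained model and usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical embeddings, invented for this example.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

sim_cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
sim_cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])
```

Semantic search follows the same pattern: embed the query, then rank stored items by their similarity to it.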
Fundamentals
Emergent Behavior
Emergent behavior refers to capabilities that appear in large AI models without having been explicitly trained for or predicted in advance. Examples include in-context learning and chain-of-thought reasoning in large language models.
Deep Learning
Encoder-Decoder
An encoder-decoder is a neural network architecture where the encoder compresses input into a latent representation and the decoder generates output from it. This architecture is foundational for translation, summarization, and image captioning.
Machine Learning
Ensemble Learning
Ensemble learning combines multiple models to produce better predictions than any individual model alone. Techniques include bagging, boosting, and stacking, which reduce variance, bias, or both.
Machine Learning
Epoch
An epoch is one complete pass through the entire training dataset during model training. Training typically requires multiple epochs for the model to converge to good performance.
Machine Learning
Evaluation Metric
An evaluation metric is a quantitative measure used to assess model performance on a given task. Common metrics include accuracy, precision, recall, F1 score, AUC-ROC, and perplexity.
AI Ethics & Safety
Explainable AI
Explainable AI (XAI) encompasses techniques that make AI system decisions understandable to humans. XAI is crucial for building trust, meeting regulatory requirements, and debugging model behavior.
Reinforcement Learning
Exploration vs Exploitation
Exploration vs exploitation is a fundamental dilemma in reinforcement learning: an agent must choose between trying new actions that might yield better rewards and repeating actions already known to work well. Balancing the two is key to optimal long-term performance.
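One simple way to balance the two is epsilon-greedy action selection, sketched below. It is one strategy among several and is not named in the entry; the `epsilon_greedy` function and the value estimates are illustrative.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniform random action
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


rng = random.Random(0)
q_values = [0.2, 0.8, 0.5]  # hypothetical estimated value of each action

always_exploit = [epsilon_greedy(q_values, 0.0, rng) for _ in range(20)]
always_explore = [epsilon_greedy(q_values, 1.0, rng) for _ in range(20)]
```

Setting `epsilon` to 0 never explores and setting it to 1 never exploits; practical agents often start with a high value and decay it as their estimates improve.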
Natural Language Processing
Extractive Summarization
Extractive summarization selects and combines the most important sentences directly from a source document to create a summary. It preserves the original wording but may lack the coherence of abstractive approaches.
F (12 entries)
Machine Learning
F1 Score
The F1 score is the harmonic mean of precision and recall, providing a single metric that balances both. An F1 score of 1 indicates perfect precision and recall, while 0 indicates total failure.
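Written out from confusion-matrix counts, the harmonic mean looks like this. The zero-division guards are a convention assumed here (returning 0.0 when a quantity is undefined), and the counts are illustrative.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean: dominated by whichever of the two values is smaller.
    return 2 * precision * recall / (precision + recall)


balanced = f1_score(tp=3, fp=1, fn=1)    # precision 0.75, recall 0.75
lopsided = f1_score(tp=1, fp=0, fn=9)    # precision 1.0, recall 0.1
```

Because the harmonic mean is pulled toward the smaller value, a model with perfect precision but poor recall still receives a low F1.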
Computer Vision
Face Recognition
Face recognition is a computer vision technology that identifies or verifies individuals by analyzing facial features in images or video. It is used in security systems, phone unlocking, and photo organization.
Data Science
Feature Engineering
Feature engineering is the process of creating, selecting, and transforming input variables to improve machine learning model performance. Good feature engineering often matters more than model choice for traditional ML tasks.
Machine Learning
Feature Extraction
Feature extraction is the process of automatically identifying and selecting the most informative representations from raw data. Deep learning models learn to extract features hierarchically, from simple edges to complex patterns.
Deep Learning
Feature Map
A feature map is the output of applying a convolutional filter to an input, representing the presence and location of detected features. Deeper layers produce feature maps capturing increasingly abstract patterns.
AI Infrastructure
Feature Store
A feature store is a centralized repository for storing, managing, and serving machine learning features. It enables feature reuse, consistency between training and serving, and collaboration across ML teams.
Machine Learning
Federated Learning
Federated learning is a machine learning approach where models are trained across decentralized devices without sharing raw data. It enables privacy-preserving AI by keeping data on local devices while aggregating model updates.
Machine Learning
Few-Shot Learning
Few-shot learning is the ability of a model to learn and generalize from only a small number of labeled examples. Large language models demonstrate impressive few-shot capabilities through in-context learning.
Generative AI
Few-Shot Prompting
Few-shot prompting provides a language model with a small number of input-output examples in the prompt to demonstrate the desired task format. This technique helps models understand task requirements without any fine-tuning.
Deep Learning
Fine-Tuning
Fine-tuning is the process of taking a pre-trained model and continuing training on a smaller, task-specific dataset. It adapts general knowledge to specialized domains while requiring far less data and compute than training from scratch.
Generative AI
Foundation Model
A foundation model is a large AI model trained on broad data that can be adapted to a wide range of downstream tasks. GPT-4, Claude, Gemini, and DALL-E are examples of foundation models that serve as bases for specialized applications.
Deep Learning
Frozen Layers
Frozen layers are neural network layers whose weights are not updated during fine-tuning. Freezing preserves the representations learned during pre-training while the remaining, unfrozen layers (typically the later ones) adapt to the new task.
G (16 entries)
Generative AI
GAN
A GAN (Generative Adversarial Network) is a generative model consisting of two competing neural networks: a generator and a discriminator. GANs produce realistic synthetic data by training these networks in an adversarial game.
Machine Learning
Gaussian Process
A Gaussian process is a probabilistic model that defines a distribution over functions, providing both predictions and uncertainty estimates. Gaussian processes are used in Bayesian optimization and surrogate modeling.
Generative AI
Gemini
Gemini is Google's family of multimodal AI models capable of processing text, images, audio, and video. It represents Google's most advanced AI system and competes with models like GPT-4 and Claude.
Generative AI
Generative Adversarial Network
A Generative Adversarial Network is a deep learning framework where two neural networks compete: a generator creates synthetic data while a discriminator evaluates authenticity. This adversarial process produces remarkably realistic outputs.
Generative AI
Generative AI
Generative AI refers to artificial intelligence systems that can create new content including text, images, music, code, and video. Technologies like GPT, DALL-E, and Stable Diffusion have made generative AI accessible to millions.
Generative AI
Generative Model
A generative model learns the underlying data distribution and can create new data samples that resemble the training data. Examples include GANs, VAEs, diffusion models, and autoregressive language models.
Generative AI
Generative Pre-trained Transformer
A Generative Pre-trained Transformer (GPT) is a type of large language model that generates text by predicting the next token in a sequence. Pre-trained on vast text corpora, GPT models exhibit broad language understanding and generation capabilities.
Fundamentals
Genetic Algorithm
A genetic algorithm is an optimization technique inspired by natural selection that evolves solutions through selection, crossover, and mutation. It is used for complex optimization problems where gradient-based methods are impractical.
Generative AI
GPT
GPT (Generative Pre-trained Transformer) is a series of large language models developed by OpenAI that generate human-quality text. GPT models are trained to predict the next token and can perform a wide range of language tasks.
AI Infrastructure
GPU
A GPU (Graphics Processing Unit) is a specialized processor designed for parallel computation that has become essential for training deep learning models. GPUs from NVIDIA dominate AI computing with architectures optimized for matrix operations.
Deep Learning
Gradient
A gradient is a vector of partial derivatives that indicates the direction and rate of steepest increase of a function. In neural networks, weights are updated in the direction opposite to the gradient of the loss function, which is the direction that reduces the loss most quickly.
Deep Learning
Gradient Clipping
Gradient clipping is a technique that limits the magnitude of gradients during training to prevent exploding gradients. It is essential for stable training of deep networks and recurrent architectures.
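One common variant is clipping by norm, where the whole gradient vector is rescaled if its length exceeds a threshold. The sketch below assumes that variant; the function name and threshold are illustrative.

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale the gradient vector so its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)  # already within bounds: leave unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]  # same direction, shorter length


clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)    # original norm is 5.0
untouched = clip_by_norm([0.3, 0.4], max_norm=1.0)  # original norm is 0.5
```

Rescaling the whole vector keeps the update direction intact, unlike clipping each component independently, which can change the direction.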
Deep Learning
Gradient Descent
Gradient descent is an iterative optimization algorithm that adjusts model parameters in the direction that reduces the loss function. Variants include stochastic gradient descent (SGD), mini-batch SGD, and Adam.
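A minimal sketch of the basic update rule on a one-dimensional function. The objective f(x) = (x - 3)^2, the learning rate, and the step count are illustrative choices.

```python
def gradient_descent(grad_fn, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the objective."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_fn(x)  # move opposite to the direction of steepest increase
    return x


# Gradient of f(x) = (x - 3)^2 is 2 * (x - 3); the minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Stochastic and mini-batch variants use the same loop but estimate the gradient from a sample of the training data at each step instead of the full dataset.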
Deep Learning
Graph Neural Network
A graph neural network (GNN) is a deep learning architecture designed to operate on graph-structured data like social networks, molecules, and knowledge graphs. GNNs learn by passing messages between connected nodes.
Data Science
Ground Truth
Ground truth refers to the correct, verified labels or annotations in a dataset used to train and evaluate machine learning models. The quality of ground truth directly impacts model reliability.
Natural Language Processing
Grounding
Grounding in AI refers to connecting a model's language understanding to real-world knowledge, data, or sensory experience. Grounded AI systems produce more factual and contextually relevant outputs.
H (10 entries)
Generative AI
Hallucination
Hallucination in AI occurs when a model generates plausible-sounding but factually incorrect or fabricated information. Reducing hallucinations is a major challenge for large language models used in high-stakes applications.
AI Applications
Hate Speech Detection
Hate speech detection is the AI task of automatically identifying harmful, abusive, or discriminatory language in text. It is a key component of content moderation systems on social media platforms.
Fundamentals
Heuristic
A heuristic is a practical problem-solving approach that uses rules of thumb to find good-enough solutions efficiently. In AI search algorithms, heuristics guide exploration toward promising solutions.
Deep Learning
Hidden Layer
A hidden layer is any neural network layer between the input and output layers. Hidden layers progressively transform data into increasingly abstract representations that enable complex pattern recognition.
Machine Learning
Hierarchical Clustering
Hierarchical clustering is an unsupervised method that builds a tree-like hierarchy of nested clusters. It can be agglomerative (bottom-up merging) or divisive (top-down splitting) and produces a dendrogram visualization.
AI Infrastructure
Hugging Face
Hugging Face is a platform and community that provides open-source tools, pre-trained models, and datasets for natural language processing and machine learning. It has become the central hub for sharing and deploying AI models.
AI Applications
Human-in-the-Loop
Human-in-the-loop (HITL) is an approach where humans actively participate in the AI decision-making or training process. HITL systems combine human judgment with AI speed to improve accuracy and safety.
Machine Learning
Hyperparameter
A hyperparameter is a configuration value set before training that controls the learning process, such as learning rate, batch size, or number of layers. Unlike model parameters, hyperparameters are not learned from data.
Machine Learning
Hyperparameter Tuning
Hyperparameter tuning is the process of finding optimal hyperparameter values to maximize model performance. Methods include grid search, random search, and Bayesian optimization.
Data Science
Hypothesis Testing
Hypothesis testing is a statistical method used to determine whether observed results are statistically significant or due to random chance. In AI, it helps validate whether model improvements are meaningful.
I (15 entries)
Computer Vision
Image Captioning
Image captioning is the AI task of generating natural language descriptions of images. It requires both visual understanding (computer vision) and text generation (NLP) capabilities.
Computer Vision
Image Classification
Image classification is the computer vision task of assigning a label to an entire image based on its visual content. Deep learning models like ResNet and Vision Transformers achieve near-human accuracy on this task.
Generative AI
Image Generation
Image generation is the AI task of creating new images from text prompts, sketches, or other inputs. Diffusion models and GANs are the leading approaches for photorealistic image synthesis.
Computer Vision
Image Segmentation
Image segmentation is the process of partitioning an image into meaningful regions or classifying each pixel into a category. It is used in medical imaging, autonomous driving, and satellite analysis.
Reinforcement Learning
Imitation Learning
Imitation learning is a technique where an AI agent learns to perform tasks by observing and mimicking expert demonstrations. It bridges the gap between supervised learning and reinforcement learning.
Data Science
Imputation
Imputation is the process of replacing missing data values with substituted values based on statistical methods or machine learning. Proper imputation prevents biased model training from incomplete datasets.
Generative AI
In-Context Learning
In-context learning is the ability of large language models to learn new tasks from examples provided within the input prompt, without any parameter updates. This emergent capability enables flexible task adaptation at inference time.
Machine Learning
Inference
Inference is the process of using a trained model to make predictions on new, unseen data. Optimizing inference speed and cost is critical for deploying AI in production applications.
Machine Learning
Information Gain
Information gain measures the reduction in entropy achieved by splitting data on a particular feature. It is a common splitting criterion for decision trees (used by algorithms such as ID3) and is also used for feature selection.
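The calculation can be sketched as follows: compute the Shannon entropy of the labels before the split, then subtract the size-weighted entropy of the groups after the split. The function names and the toy labels are illustrative assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child groups."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
# A split that separates the classes perfectly removes all uncertainty.
gain = information_gain(parent, [["yes", "yes"], ["no", "no"]])
```

A split that leaves the class mix unchanged in every child yields a gain of zero, so the tree-building procedure would prefer any feature that does better.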
AI Applications
Information Retrieval
Information retrieval is the science of searching and extracting relevant documents or data from large collections. Modern AI-powered search uses embeddings and language models to understand semantic meaning.
Computer Vision
Instance Segmentation
Instance segmentation is a computer vision task that identifies each object in an image and delineates its exact pixel boundary. Unlike semantic segmentation, it distinguishes between individual instances of the same class.
Generative AI
Instruction Tuning
Instruction tuning is a fine-tuning process that trains language models to follow natural language instructions across diverse tasks. It greatly improves a model's ability to understand and execute user requests.
AI Applications
Intelligent Agent
An intelligent agent is an autonomous entity that observes its environment through sensors and acts upon it through actuators to achieve goals. Modern AI agents combine perception, reasoning, and action in complex workflows.
Reinforcement Learning
Inverse Reinforcement Learning
Inverse reinforcement learning infers the reward function that an expert is optimizing by observing their behavior. It enables AI systems to learn goals and preferences from demonstrations.
AI Applications
IoT and AI
IoT and AI refers to the integration of artificial intelligence with Internet of Things devices to enable smart, autonomous decision-making at the edge. This combination powers smart homes, industrial IoT, and wearable health devices.
J (3 entries)
Data Science
Jaccard Index
The Jaccard index is a similarity metric that measures the overlap between two sets by dividing the size of their intersection by the size of their union. It is commonly used in object detection evaluation and text similarity.
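The definition translates directly into a few lines of code. In this sketch the function name is illustrative, and returning 1.0 for two empty sets is an assumed convention (the ratio is otherwise undefined).

```python
def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)

# {2, 3} is shared out of {1, 2, 3, 4} overall.
score = jaccard_index({1, 2, 3}, {2, 3, 4})
```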
AI Infrastructure
JAX
JAX is a Google-developed numerical computing library that combines NumPy-like syntax with automatic differentiation and GPU/TPU acceleration. JAX is increasingly popular for high-performance machine learning research.
Data Science
Joint Probability
Joint probability is the probability of two or more events occurring simultaneously. Understanding joint probability distributions is fundamental to probabilistic machine learning and Bayesian inference.
K (6 entries)
Machine Learning
K-Means Clustering
K-Means is an unsupervised clustering algorithm that partitions data into K groups by minimizing the squared distance between points and their assigned cluster centroids. It is one of the most widely used clustering methods.
Machine Learning
K-Nearest Neighbors
K-Nearest Neighbors (KNN) is a simple machine learning algorithm that classifies data points based on the majority class of their K closest neighbors. KNN requires no training phase but can be computationally expensive at inference.
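A minimal sketch of KNN classification, assuming Euclidean distance and a simple majority vote. The function name, the `(point, label)` data layout, and the toy points are illustrative assumptions.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query.

    train is a list of (point, label) pairs; points are coordinate tuples.
    """
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
label = knn_predict(train, (0.5, 0.5))
```

All the work happens at query time, since every prediction sorts the full training set by distance; that is the inference cost the entry refers to.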
Machine Learning
Kernel
A kernel is a function that computes similarity between data points, typically corresponding to an inner product in a higher-dimensional feature space that never has to be computed explicitly. Kernels enable Support Vector Machines and other algorithms to find non-linear decision boundaries.
Deep Learning
Knowledge Distillation
Knowledge distillation is a technique where a smaller model (student) is trained to mimic the outputs of a larger model (teacher). This transfers the teacher's knowledge into a more efficient model suitable for deployment.
AI Applications
Knowledge Graph
A knowledge graph is a structured representation of real-world entities and the relationships between them. AI systems use knowledge graphs to enhance reasoning, question answering, and recommendation systems.
Fundamentals
Knowledge Representation
Knowledge representation is the field of AI concerned with encoding information about the world in a form that AI systems can use for reasoning. It includes ontologies, semantic networks, and logic-based formalisms.
L (12 entries)
Machine Learning
Label
A label is the target output or ground truth annotation associated with a training example in supervised learning. Models learn to predict correct labels from input features during the training process.
Natural Language Processing
Language Model
A language model is an AI system that learns the probability distribution of sequences of words in a language. Modern language models like GPT and Claude can generate text, answer questions, and perform complex reasoning.
Natural Language Processing
Large Language Model
A Large Language Model (LLM) is a neural network with billions of parameters trained on massive text datasets to understand and generate human language. LLMs like GPT-4, Claude, and Gemini demonstrate broad capabilities across language tasks.
Deep Learning
Latent Space
Latent space is a compressed, lower-dimensional representation of data learned by a model. In generative AI, navigating latent space allows smooth interpolation between data points and controlled generation.
Deep Learning
Layer
A layer is a fundamental building block of a neural network that performs a specific transformation on its input. Common layer types include dense, convolutional, recurrent, and attention layers.
Machine Learning
Lazy Learning
Lazy learning is an approach where the model delays computation until a query is made rather than building a model during training. K-Nearest Neighbors is the most well-known lazy learning algorithm.
Deep Learning
Learning Rate
The learning rate is a hyperparameter that controls the step size during gradient descent optimization. Too high a learning rate causes instability, while too low a rate leads to slow convergence.
Machine Learning
Linear Regression
Linear regression is a statistical method that models the relationship between a dependent variable and one or more independent variables using a linear equation. It is one of the simplest and most interpretable ML algorithms.
Machine Learning
Logistic Regression
Logistic regression is a classification algorithm that uses a sigmoid function to model the probability of a binary outcome. Despite its name, it is used for classification: the predicted probability is compared against a threshold (commonly 0.5) to assign a class.
Generative AI
LoRA
LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning technique that adds small trainable matrices to frozen pre-trained model weights. LoRA dramatically reduces the memory and compute required for fine-tuning large models.
Machine Learning
Loss Function
A loss function is a mathematical function that quantifies how far a model's predictions are from the actual values. The model training process minimizes the loss function through optimization.
Deep Learning
LSTM
Long Short-Term Memory (LSTM) is a type of recurrent neural network architecture designed to learn long-range dependencies in sequential data. LSTMs use gate mechanisms to control information flow and mitigate the vanishing gradient problem.
M (17 entries)
Machine Learning
Machine Learning
Machine learning is a branch of artificial intelligence where systems learn patterns from data to make predictions or decisions without being explicitly programmed. It encompasses supervised, unsupervised, and reinforcement learning approaches.
Fundamentals
Markov Chain
A Markov chain is a mathematical model describing a sequence of events where the probability of each event depends only on the current state. Markov chains are used in language modeling, page ranking, and MCMC sampling.
Reinforcement Learning
Markov Decision Process
A Markov Decision Process (MDP) is a mathematical framework for modeling sequential decision-making problems with probabilistic outcomes. MDPs are the formal foundation for reinforcement learning algorithms.
Computer Vision
Masked Autoencoder
A masked autoencoder is a self-supervised learning method that masks random patches of an image and trains the model to reconstruct them. It has proven highly effective for pre-training vision models.
Natural Language Processing
Masked Language Model
A masked language model is a training approach where random tokens in a sentence are hidden and the model learns to predict them from context. BERT popularized masked language modeling as a pre-training objective.
Machine Learning
Meta-Learning
Meta-learning, or learning to learn, is an approach where AI systems learn how to quickly adapt to new tasks from limited data. Meta-learning algorithms optimize the learning process itself rather than just task performance.
Reinforcement Learning
Minimax
Minimax is a decision-making algorithm used in adversarial settings where one player tries to maximize their score while the other minimizes it. It is the classical approach for game-playing AI systems.
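The recursion can be sketched on a tiny hand-built game tree. Representing the tree as nested lists with numeric leaves is an assumption made for illustration; a real game would generate child positions from the rules.

```python
def minimax(node, maximizing=True):
    """Return the value of a game tree given as nested lists with numeric leaves."""
    if isinstance(node, (int, float)):
        # Leaf: the score from the maximizing player's point of view.
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# The maximizer picks a branch, then the minimizer picks a leaf within it.
tree = [[3, 5], [2, 9]]
best = minimax(tree)
```

Here the minimizer would answer the left branch with 3 and the right branch with 2, so the maximizer's best guaranteed outcome is 3.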
Deep Learning
Mixture of Experts
Mixture of Experts (MoE) is an architecture that uses multiple specialized sub-networks (experts) and a gating mechanism to route inputs to the most relevant experts. MoE enables scaling model capacity without proportionally increasing compute.
AI Infrastructure
MLOps
MLOps (Machine Learning Operations) is the practice of applying DevOps principles to the machine learning lifecycle, including development, deployment, monitoring, and maintenance. MLOps ensures reliable, reproducible, and scalable ML systems.
AI Ethics & Safety
Model Card
A model card is a documentation framework that provides essential information about a machine learning model, including its intended use, performance metrics, limitations, and ethical considerations.
Generative AI
Model Collapse
Model collapse is a phenomenon where AI models trained on AI-generated data progressively lose diversity and quality over generations. It highlights the importance of maintaining high-quality human-generated training data.
AI Infrastructure
Model Serving
Model serving is the process of deploying trained machine learning models to production environments where they can respond to prediction requests. Efficient serving requires optimization for latency, throughput, and cost.
Fundamentals
Monte Carlo Method
Monte Carlo methods are computational algorithms that use repeated random sampling to estimate mathematical results. In AI, they are used in reinforcement learning, probabilistic inference, and tree search algorithms.
AI Applications
Multi-Agent System
A multi-agent system consists of multiple AI agents that interact, cooperate, or compete to solve complex problems. These systems model real-world scenarios like traffic management, markets, and collaborative robotics.
Deep Learning
Multi-Head Attention
Multi-head attention is a mechanism that runs multiple attention operations in parallel, allowing the model to attend to different aspects of the input simultaneously. It is a core component of the Transformer architecture.
Machine Learning
Multi-Task Learning
Multi-task learning is a training approach where a model learns to perform multiple related tasks simultaneously. Sharing representations across tasks often improves performance and data efficiency.
Generative AI
Multimodal AI
Multimodal AI refers to systems that can process and understand multiple types of data, such as text, images, audio, and video. Models like GPT-4 and Gemini are multimodal, enabling richer human-AI interaction.
N (12 entries)
Natural Language Processing
N-gram
An N-gram is a contiguous sequence of N items from a text, used in language modeling and text analysis. Unigrams, bigrams, and trigrams capture local word patterns and co-occurrence statistics.
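Extracting N-grams amounts to sliding a window of length N over the token list. The sketch below assumes whitespace tokenization for the example; the function name is illustrative.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token windows as tuples, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the cat sat".split()
bigrams = ngrams(tokens, 2)
```

Counting how often each tuple occurs in a corpus gives the co-occurrence statistics that classical N-gram language models are built on.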
Natural Language Processing
Named Entity Recognition
Named Entity Recognition (NER) is an NLP task that identifies and classifies named entities like people, organizations, locations, and dates in text. NER is a fundamental building block for information extraction.
Natural Language Processing
Natural Language Generation
Natural Language Generation (NLG) is the AI task of producing coherent, human-readable text from structured data or prompts. Large language models have made NLG remarkably fluent and contextually appropriate.
Natural Language Processing
Natural Language Inference
Natural Language Inference (NLI) is the task of determining whether a hypothesis is entailed by, contradicts, or is neutral to a given premise. NLI benchmarks test a model's understanding of logical relationships in text.
Natural Language Processing
Natural Language Processing
Natural Language Processing (NLP) is the field of AI focused on enabling machines to understand, interpret, and generate human language. NLP powers applications from chatbots and translation to sentiment analysis and search.
Natural Language Processing
Natural Language Understanding
Natural Language Understanding (NLU) is the subfield of NLP focused on machine reading comprehension: extracting meaning, intent, and context from text. NLU is essential for virtual assistants and conversational AI.
Deep Learning
Neural Architecture Search
Neural Architecture Search (NAS) is an automated process for discovering optimal neural network architectures for a given task. NAS uses search algorithms to explore vast design spaces that would be impractical to navigate manually.
Deep Learning
Neural Network
A neural network is a computing system inspired by biological neurons that processes information through interconnected layers of nodes. Neural networks are the foundation of deep learning and power most modern AI applications.
Computer Vision
Neural Radiance Field
A Neural Radiance Field (NeRF) is a deep learning method that represents 3D scenes as continuous functions, enabling photorealistic novel view synthesis from 2D images. NeRFs have transformed 3D reconstruction and rendering.
Data Science
Noise
Noise in data science refers to random, irrelevant, or erroneous information in a dataset that can hinder model learning. Effective ML systems must distinguish meaningful signal from noise.
Deep Learning
Noise Injection
Noise injection is a regularization technique that adds random noise to inputs, weights, or gradients during training. It improves model robustness and generalization by preventing over-reliance on specific patterns.
Data Science
Normalization
Normalization is the process of scaling input features to a standard range or distribution to improve model training. Common techniques include min-max scaling, z-score standardization, and layer normalization.
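Two of the techniques named above, min-max scaling and z-score standardization, can be sketched with the standard library. The function names are illustrative, and mapping a constant column to zeros is an assumed convention to avoid dividing by zero.

```python
import statistics

def min_max_scale(values):
    """Linearly map values onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed convention: a constant feature maps to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

scaled = min_max_scale([2, 4, 6])
standardized = z_score([1, 2, 3])
```

Min-max scaling is sensitive to outliers because a single extreme value sets the range, whereas z-scores are unbounded but centered.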
O (10 entries)
Computer Vision
Object Detection
Object detection is a computer vision task that identifies and locates multiple objects within an image by predicting bounding boxes and class labels. YOLO, Faster R-CNN, and DETR are popular object detection models.
Computer Vision
Object Tracking
Object tracking is the computer vision task of following the movement of specific objects across consecutive frames in a video. It is essential for surveillance, autonomous driving, and sports analytics.
Data Science
One-Hot Encoding
One-hot encoding is a technique that converts categorical variables into binary vectors where only one element is 1 and the rest are 0. It is a standard preprocessing step for feeding categorical data to machine learning models.
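A minimal sketch of the encoding, assuming categories are ordered alphabetically to fix the column positions. The function name and the color example are illustrative.

```python
def one_hot(values):
    """Return the sorted category list and one binary vector per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    vectors = [
        [1 if index[value] == i else 0 for i in range(len(categories))]
        for value in values
    ]
    return categories, vectors

categories, vectors = one_hot(["red", "green", "red"])
```

Each vector has as many positions as there are distinct categories, so features with very many categories produce wide, sparse inputs.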
Machine Learning
One-Shot Learning
One-shot learning is the ability of a model to learn a new concept from just a single example. It is particularly important in applications like face verification where collecting many examples per person is impractical.
Machine Learning
Online Learning
Online learning is a training paradigm where the model updates its parameters incrementally as new data arrives, rather than retraining on the entire dataset. It is essential for streaming data and dynamic environments.
AI Infrastructure
Open Source AI
Open source AI refers to AI models, tools, and frameworks whose source code or weights are publicly available for use, modification, and distribution, subject to their license terms. Openly released models like LLaMA and Mistral and frameworks like PyTorch drive AI democratization.
Generative AI
OpenAI
OpenAI is an AI research company that created ChatGPT, GPT-4, and DALL-E. Founded in 2015, it has been instrumental in advancing large language models and bringing generative AI to mainstream adoption.
Computer Vision
Optical Character Recognition
Optical Character Recognition (OCR) is the technology that converts images of text into machine-readable text data. Modern OCR uses deep learning to handle diverse fonts, handwriting, and document layouts.
Machine Learning
Optimization
Optimization in machine learning is the process of adjusting model parameters to minimize (or maximize) an objective function. Gradient-based optimization methods are the backbone of neural network training.
Machine Learning
Overfitting
Overfitting occurs when a model learns the training data too well, including its noise and outliers, and fails to generalize to new data. Regularization, dropout, and early stopping are common strategies to combat overfitting.
P (19 entries)
Computer Vision
Panoptic Segmentation
Panoptic segmentation unifies semantic and instance segmentation by assigning every pixel a semantic class and an instance identity. It provides a complete understanding of scene composition.
Machine Learning
Parameter
A parameter is a learnable variable within a model that is adjusted during training, such as weights and biases in a neural network. Large language models contain billions of parameters.
Generative AI
Parameter-Efficient Fine-Tuning
Parameter-Efficient Fine-Tuning (PEFT) refers to techniques that adapt large models by updating only a small subset of parameters. Methods like LoRA, adapters, and prefix tuning enable fine-tuning with minimal compute.
Deep Learning
Perceptron
A perceptron is the simplest type of artificial neural network consisting of a single neuron that computes a weighted sum of inputs and applies a threshold function. It is the fundamental building block of more complex networks.
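The weighted sum and threshold can be sketched in a few lines. The weights and bias below are hand-picked (not learned) so that the unit behaves like a logical AND; they are assumptions for illustration only.

```python
def perceptron_predict(weights, bias, inputs):
    """Weighted sum of the inputs plus bias, passed through a step threshold."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

# Hand-picked parameters: the sum exceeds zero only when both inputs are 1.
and_weights, and_bias = [1.0, 1.0], -1.5
outputs = [perceptron_predict(and_weights, and_bias, x)
           for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

A single perceptron can only separate classes with a straight line (or hyperplane), which is why functions like XOR need more than one unit.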
Natural Language Processing
Perplexity
Perplexity is a metric that measures how well a language model predicts a text sequence: it is the exponential of the average negative log-likelihood per token, so lower perplexity indicates better prediction. Perplexity is also the name of an AI-powered search engine that provides cited, conversational answers.
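For the metric, a common formulation is the exponential of the average negative log-probability the model assigned to each actual token. The sketch below assumes those per-token probabilities are already available as a list; the function name is illustrative.

```python
import math

def perplexity(token_probs):
    """exp of the mean negative natural-log probability of the observed tokens."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# A model that assigns probability 0.25 to every token is as uncertain
# as a uniform choice among four options.
ppl = perplexity([0.25, 0.25, 0.25, 0.25])
```

Read this way, perplexity is the effective number of equally likely choices the model is weighing at each step.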
AI Infrastructure
Pipeline
A pipeline in ML is a sequence of data processing and modeling steps chained together to automate a workflow. ML pipelines include data preprocessing, feature engineering, model training, and evaluation stages.
Reinforcement Learning
Policy
A policy in reinforcement learning is a function that maps states to actions, defining the agent's behavior strategy. The goal of RL is to learn an optimal policy that maximizes cumulative reward.
Computer Vision
Pose Estimation
Pose estimation is the computer vision task of detecting the position and orientation of a person's body joints in images or video. It enables applications in fitness tracking, motion capture, and human-computer interaction.
Deep Learning
Positional Encoding
Positional encoding adds information about token position to input embeddings in Transformer models, which otherwise have no inherent sense of sequence order. This enables the model to understand word order and sentence structure.
Deep Learning
Pre-training
Pre-training is the initial phase of training a model on a large, general dataset before fine-tuning on specific tasks. Pre-training enables models to learn broad language or visual understanding that transfers to many applications.
Machine Learning
Precision
Precision is a classification metric measuring the proportion of true positive predictions among all positive predictions. High precision means few false positives, which is important when the cost of false alarms is high.
Machine Learning
Prediction
Prediction is the output of a trained model when given new input data. Machine learning predictions can be categorical (classification), numerical (regression), or generative (text, images).
Data Science
Principal Component Analysis
Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms data into a new coordinate system where the greatest variance lies along the first coordinates. PCA is widely used for data visualization and noise reduction.
Generative AI
Prompt
A prompt is the input text or instruction given to a language model to elicit a desired response. The quality and specificity of prompts significantly influence the relevance and accuracy of AI-generated outputs.
Generative AI
Prompt Chaining
Prompt chaining is a technique where the output of one language model call becomes the input for the next, creating a pipeline of AI reasoning steps. It enables complex workflows that exceed what a single prompt can accomplish.
Generative AI
Prompt Engineering
Prompt engineering is the practice of designing and optimizing input prompts to get the best possible responses from AI language models. Techniques include few-shot examples, chain of thought, and structured formatting.
AI Ethics & Safety
Prompt Injection
Prompt injection is a security vulnerability where malicious instructions embedded in user input override or manipulate an AI system's intended behavior. Defending against prompt injection is an active area of AI security research.
Deep Learning
Pruning
Pruning is a model compression technique that removes unnecessary weights or neurons from a neural network to reduce its size and computational cost. Pruned models can be significantly smaller while maintaining most of their accuracy.
AI Infrastructure
PyTorch
PyTorch is an open-source deep learning framework, originally developed by Meta and now governed by the PyTorch Foundation, that provides flexible tensor computation with GPU acceleration. It is widely used in AI research due to its intuitive design and dynamic computation graphs.
Q (4 entries)
Reinforcement Learning
Q-Learning
Q-learning is a model-free reinforcement learning algorithm that learns the value of actions in states to find an optimal policy. It uses a Q-table or neural network to estimate expected cumulative rewards for each state-action pair.
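The tabular form of the update can be sketched as below, assuming a dictionary-backed Q-table with unseen entries defaulting to zero. The function name, the state and action names, and the values of `alpha` (learning rate) and `gamma` (discount factor) are illustrative assumptions.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update to the table q and return the new estimate."""
    # Best value currently estimated for the next state (0.0 if unseen).
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    # Move the old estimate a fraction alpha toward the bootstrapped target.
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

q_table = {}
new_value = q_update(q_table, "s0", "right", reward=1.0,
                     next_state="s1", actions=["left", "right"])
```

The update needs no model of the environment, only the observed transition, which is what "model-free" means here.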
AI Infrastructure
Quantization
Quantization is a technique that reduces model size and speeds up inference by converting high-precision weights to lower precision (e.g., 32-bit floats to 8-bit or 4-bit integers). It enables large models to run on consumer hardware, often with only a small loss in accuracy.
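One simple scheme is symmetric 8-bit quantization with a single scale for the whole weight list; the sketch below illustrates that scheme only and is not how any particular library implements it. The function names and example weights are assumptions.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]

weights = [-1.0, 0.0, 0.37, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
```

The round trip is lossy: each restored weight can differ from the original by up to half a quantization step, which is the source of the accuracy cost.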
Deep Learning
Query in Attention
In the attention mechanism, a query is a vector representing what information the current position is looking for. Queries are compared with keys to compute attention weights, which determine how much each value contributes to the output.
Natural Language Processing
Question Answering
Question answering is the NLP task of automatically generating answers to questions posed in natural language. Modern QA systems range from extractive (finding answers in text) to generative (producing new answer text).
R (19 entries)
Machine Learning
Random Forest
Random Forest is an ensemble learning method that trains multiple decision trees on random subsets of the data and features, then combines their predictions through majority voting (classification) or averaging (regression). It is robust and requires relatively little tuning.
Machine Learning
Recall
Recall is a classification metric measuring the proportion of actual positives that were correctly identified by the model. High recall is critical in medical diagnosis and other applications where missing true positives is costly.
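Recall can be computed from true positives and false negatives; the sketch below also returns precision for contrast, since the two are usually read together. The function name, the label encoding (1 = positive), and returning 0.0 when a denominator is zero are illustrative assumptions.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Four actual positives, of which the model finds two and raises no false alarms.
y_true = [1, 1, 1, 1, 0]
y_pred = [1, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
```

In this example every positive prediction is correct, yet half of the actual positives are missed, which is exactly the failure mode recall is meant to expose.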
AI Applications
Recommendation System
A recommendation system is an AI application that predicts and suggests items a user might be interested in. Netflix, Spotify, and Amazon use recommendation systems powered by collaborative filtering and deep learning.
Deep Learning
Recurrent Neural Network
A Recurrent Neural Network (RNN) is a neural architecture designed for sequential data that maintains a hidden state across time steps. While largely superseded by Transformers, RNNs introduced the concept of memory in neural networks.
Machine Learning
Regression
Regression is a supervised learning task where the model predicts a continuous numerical value rather than a category. Examples include predicting house prices, stock returns, and temperature forecasts.
Machine Learning
Regularization
Regularization is a set of techniques that prevent overfitting by adding constraints or penalties to the model during training. Common methods include L1/L2 regularization, dropout, and early stopping.
Reinforcement Learning
Reinforcement Learning
Reinforcement learning is a machine learning paradigm where an agent learns to make decisions by receiving rewards or penalties for its actions in an environment. It has achieved breakthroughs in game playing, robotics, and AI alignment.
Generative AI
Reinforcement Learning from Human Feedback
Reinforcement Learning from Human Feedback (RLHF) is a training technique that uses human preference judgments to fine-tune AI models, aligning their outputs with human values and expectations. RLHF is a key technique for making language models helpful, harmless, and honest.
Deep Learning
Representation Learning
Representation learning is the automatic discovery of useful data representations needed for machine learning tasks. Deep learning is fundamentally a form of representation learning that builds hierarchical feature abstractions.
Deep Learning
Residual Network
A Residual Network (ResNet) is a deep neural network architecture that uses skip connections to enable training of very deep networks. Skip connections ease gradient flow, mitigating the vanishing gradient problem and enabling networks with hundreds of layers.
AI Ethics & Safety
Responsible AI
Responsible AI encompasses practices and principles for developing AI systems that are fair, transparent, accountable, and beneficial to society. It addresses bias, privacy, safety, and the broader social impact of AI technology.
Generative AI
Retrieval-Augmented Generation
Retrieval-Augmented Generation (RAG) is a technique that enhances language model responses by first retrieving relevant documents from a knowledge base and supplying them to the model as context. RAG can reduce hallucinations and helps keep AI responses grounded in up-to-date, factual information.
Reinforcement Learning
Reward Model
A reward model is a trained model that predicts human preferences between different AI outputs, providing a scalar reward signal. Reward models are central to RLHF and are used to align language models with human values.
Reinforcement Learning
Reward Shaping
Reward shaping is the practice of designing intermediate reward signals to guide reinforcement learning agents toward desired behaviors more efficiently. Good reward shaping accelerates training while avoiding unintended shortcuts.
Deep Learning
RNN
An RNN (Recurrent Neural Network) is a class of neural networks where connections between nodes form cycles, allowing the network to maintain temporal state. While effective for sequences, RNNs struggle with long-range dependencies compared to Transformers.
Robotics & Automation
Robotic Process Automation
Robotic Process Automation (RPA) uses software robots to automate repetitive, rule-based business tasks like data entry and form processing. AI-enhanced RPA can handle unstructured data and make intelligent decisions.
Robotics & Automation
Robotics
Robotics is the field of engineering and AI focused on designing, building, and programming robots that can interact with the physical world. AI-powered robotics combines computer vision, planning, and motor control.
AI Ethics & Safety
Robustness
Robustness in AI refers to a model's ability to maintain performance when faced with unexpected inputs, adversarial attacks, or distribution shifts. Building robust AI systems is essential for reliable real-world deployment.
Machine Learning
ROC Curve
A ROC (Receiver Operating Characteristic) curve plots the true positive rate against the false positive rate at various classification thresholds. The area under the ROC curve (AUC) is a widely used metric for classifier performance.
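As a rough illustration, the curve can be traced by sweeping a threshold over the predicted scores and recording the two rates at each step. The labels and scores below are made-up toy data, and `roc_points` and `auc` are hypothetical helper names; libraries such as scikit-learn provide production versions.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) pairs obtained by sweeping the threshold over the scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Thresholds from high to low; predict positive when score >= threshold.
    thresholds = sorted(set(scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve, computed with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


labels = [1, 1, 0, 1, 0, 0]              # toy ground truth (1 = positive)
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # toy classifier scores
points = roc_points(labels, scores)
print(points)
print(auc(points))
```

For this toy data, 8 of the 9 positive-negative pairs are ranked correctly, so the AUC comes out at 8/9.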
S
23 entries
Generative AI
Sampling
Sampling in generative AI is the process of selecting tokens from a probability distribution during text generation. Different sampling strategies like top-k and top-p control the randomness and creativity of outputs.
Deep Learning
Scaling Laws
Scaling laws are empirical relationships showing how model performance improves predictably with increases in model size, data, and compute. They guide decisions about resource allocation in training large AI models.
Deep Learning
Self-Attention
Self-attention is an attention mechanism where each element in a sequence computes attention scores with every other element in the same sequence. It enables Transformers to capture long-range dependencies regardless of distance.
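The computation can be sketched in a few lines. This simplified version assumes the queries, keys, and values are all the raw input vectors (real Transformers apply learned projection matrices first), and the function names are hypothetical.

```python
import math


def softmax(xs):
    # Subtract the maximum for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """Scaled dot-product self-attention with Q = K = V = X (no learned projections)."""
    d = len(X[0])
    output = []
    for q in X:
        # Score this position against every position in the same sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in X]
        weights = softmax(scores)
        # Each output is a weighted average of all value vectors.
        output.append([sum(w * v[j] for w, v in zip(weights, X)) for j in range(d)])
    return output


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy sequence of three 2-d vectors
out = self_attention(X)
for row in out:
    print(row)
```

Because every position attends to every other position directly, the distance between two elements in the sequence does not affect how easily they can interact.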
Machine Learning
Self-Supervised Learning
Self-supervised learning is a training approach where models generate their own supervisory signals from unlabeled data. Pre-training large language models with next-token prediction is a form of self-supervised learning.
AI Applications
Semantic Search
Semantic search uses AI to understand the meaning and intent behind queries rather than just matching keywords. It leverages embeddings and language models to return results that are conceptually relevant.
Natural Language Processing
Semantic Similarity
Semantic similarity is a measure of how closely two pieces of text convey the same meaning. AI computes semantic similarity using vector embeddings, enabling applications like duplicate detection and recommendation.
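One common way to compare two embeddings is cosine similarity, sketched below. The three-dimensional vectors are invented for illustration; real embeddings come from a trained model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1 = same direction, 0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, not produced by any real model.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten))
print(cosine_similarity(cat, invoice))
```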
Machine Learning
Semi-Supervised Learning
Semi-supervised learning uses a combination of a small amount of labeled data and a large amount of unlabeled data for training. It bridges the gap between supervised and unsupervised learning.
Natural Language Processing
Sentiment Analysis
Sentiment analysis is the NLP task of determining the emotional tone or opinion expressed in text — positive, negative, or neutral. It is widely used in brand monitoring, customer feedback analysis, and social media analytics.
Deep Learning
Sequence-to-Sequence
Sequence-to-sequence (Seq2Seq) is a model architecture that transforms one sequence into another, used in translation, summarization, and dialogue. It consists of an encoder that reads the input and a decoder that generates the output.
AI Ethics & Safety
SHAP
SHAP (SHapley Additive exPlanations) is an explainability method based on game theory that assigns each feature an importance value for a particular prediction. SHAP provides consistent, locally accurate explanations for any ML model.
Deep Learning
Sigmoid Function
The sigmoid function is an activation function that maps any real input to a value between 0 and 1, making it useful for binary classification outputs. It has been largely replaced by ReLU in hidden layers but remains standard for the output layer of binary classifiers.
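The function is sigmoid(x) = 1 / (1 + e^(-x)). A small sketch, written in a form that avoids overflow for large negative inputs:

```python
import math


def sigmoid(x):
    """Logistic sigmoid, branching on the sign of x to avoid overflow in exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


print(sigmoid(0.0))  # 0.5
print(sigmoid(4.0))
print(sigmoid(-4.0))
```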
Robotics & Automation
Sim-to-Real Transfer
Sim-to-real transfer is the process of training AI models in simulation and deploying them in the real world. It is crucial in robotics where real-world training is expensive, slow, or dangerous.
Deep Learning
Softmax
Softmax is a function that converts a vector of raw scores into a probability distribution where all values sum to 1. It is the standard output activation for multi-class classification and attention mechanisms.
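A short sketch of the computation. Subtracting the maximum score before exponentiating is a standard stability trick and does not change the result.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))
```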
Deep Learning
Sparse Model
A sparse model activates only a subset of its parameters for each input, reducing computational cost while maintaining capacity. Mixture of Experts and pruned networks are common sparse model architectures.
AI Applications
Speech Recognition
Speech recognition is the AI capability of converting spoken language into text. Modern systems like Whisper use deep learning to achieve near-human accuracy across multiple languages.
Generative AI
Stable Diffusion
Stable Diffusion is an open-source AI image generation model that creates images from text descriptions using a latent diffusion process. Its open nature has spurred a large community of developers and artists.
Deep Learning
Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is an optimization algorithm that updates model weights using the gradient computed from a single randomly chosen example or, more commonly in practice, a small random subset (mini-batch) of the training data. SGD is computationally efficient, and the noise in its gradient estimates can help the optimizer escape poor local minima and saddle points.
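A minimal sketch of mini-batch SGD, fitting a line y = w*x + b to noise-free toy data generated from y = 2x + 1. The function name, learning rate, batch size, and epoch count are illustrative assumptions, not recommended settings.

```python
import random


def sgd_linear_fit(data, lr=0.05, epochs=200, batch_size=4, seed=0):
    """Fit y = w*x + b by mini-batch SGD on mean squared error."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # new random mini-batches every epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the squared error, averaged over this mini-batch only.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Toy data on x in [-1, 1] generated from y = 2x + 1.
data = [(x / 10, 2 * (x / 10) + 1) for x in range(-10, 11)]
w, b = sgd_linear_fit(data)
print(w, b)
```

Each update touches only a handful of examples, which is what makes the method cheap per step compared with computing the gradient over the full dataset.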
Computer Vision
Style Transfer
Style transfer is a computer vision technique that applies the artistic style of one image to the content of another. Neural style transfer uses deep learning to separate and recombine content and style representations.
Machine Learning
Supervised Learning
Supervised learning is a machine learning approach where models learn from labeled training data — input-output pairs. It is the most common ML paradigm, powering classification and regression tasks.
Machine Learning
Support Vector Machine
A Support Vector Machine (SVM) is a classification algorithm that finds the optimal hyperplane separating different classes with maximum margin. SVMs are effective for high-dimensional data and small datasets.
Robotics & Automation
Swarm Intelligence
Swarm intelligence is a collective behavior that emerges from groups of simple agents following local rules, inspired by natural systems like ant colonies and bird flocks. It is used in optimization and multi-robot coordination.
Data Science
Synthetic Data
Synthetic data is artificially generated data that mimics the statistical properties of real-world data. It is used to augment training datasets, protect privacy, and test models when real data is scarce or sensitive.
Generative AI
System Prompt
A system prompt is a hidden instruction given to a language model that defines its behavior, persona, and constraints for a conversation. System prompts shape how AI assistants respond without being visible to end users.
T
19 entries
Generative AI
Temperature
Temperature is a parameter in language model text generation that controls the randomness of output. Lower temperatures produce more deterministic, focused responses, while higher temperatures increase creativity and diversity.
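In most implementations, temperature divides the model's raw scores (logits) before softmax is applied. A small sketch with made-up logits; the function name is hypothetical:

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply softmax."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # toy scores for three candidate tokens
sharp = softmax_with_temperature(logits, 0.5)    # low temperature: more peaked
neutral = softmax_with_temperature(logits, 1.0)
flat = softmax_with_temperature(logits, 2.0)     # high temperature: more uniform
print(sharp)
print(neutral)
print(flat)
```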
Deep Learning
Tensor
A tensor is a multi-dimensional array of numbers that serves as the fundamental data structure in deep learning. Scalars, vectors, and matrices are all specific cases of tensors.
AI Infrastructure
TensorFlow
TensorFlow is an open-source machine learning framework developed by Google that provides tools for building and deploying ML models. It supports distributed training, mobile deployment, and production serving.
Natural Language Processing
Text Classification
Text classification is the NLP task of assigning predefined categories to text documents. Applications include spam filtering, topic labeling, and content moderation.
Generative AI
Text Generation
Text generation is the AI task of producing coherent, contextually relevant text, typically through autoregressive language models. Modern text generation powers chatbots, creative writing tools, and code assistants.
Generative AI
Text-to-Image
Text-to-image generation creates visual images from natural language descriptions using AI models like DALL-E, Midjourney, and Stable Diffusion. It has transformed creative workflows and content production.
AI Applications
Text-to-Speech
Text-to-speech (TTS) is the AI technology that converts written text into natural-sounding spoken audio. Modern TTS systems produce remarkably human-like voices with appropriate prosody and emotion.
Natural Language Processing
Token
A token is the basic unit of text that a language model processes, which can be a word, subword, or character depending on the tokenizer. For GPT-4, one token corresponds to roughly 4 characters of English text on average.
Natural Language Processing
Tokenization
Tokenization is the process of splitting text into tokens that a language model can process. Subword tokenization methods such as BPE (byte-pair encoding), and toolkits such as SentencePiece that implement them, balance vocabulary size with the ability to represent any text sequence.
Generative AI
Tool Use
Tool use in AI refers to a language model's ability to interact with external tools like calculators, web browsers, code interpreters, and APIs. Tool use extends AI capabilities beyond pure text generation.
Generative AI
Top-k Sampling
Top-k sampling is a text generation strategy that restricts token selection to the k most probable next tokens. It prevents the model from selecting highly unlikely tokens while maintaining output diversity.
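A minimal sketch of the idea: keep the k most probable tokens, renormalize, and sample from what remains. Token ids are list indices here, and the function names and toy distribution are assumptions for illustration.

```python
import random


def top_k_filter(probs, k):
    """Keep the k most probable token ids and renormalize their probabilities."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return {i: probs[i] / total for i in top}


def sample_top_k(probs, k, rng):
    """Draw one token id from the filtered distribution."""
    filtered = top_k_filter(probs, k)
    tokens = list(filtered)
    return rng.choices(tokens, weights=[filtered[t] for t in tokens], k=1)[0]


probs = [0.5, 0.3, 0.15, 0.05]  # toy next-token distribution
filtered = top_k_filter(probs, 2)
print(filtered)

rng = random.Random(0)
samples = [sample_top_k(probs, 2, rng) for _ in range(20)]
print(samples)
```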
Generative AI
Top-p Sampling
Top-p sampling (nucleus sampling) selects from the smallest set of tokens whose cumulative probability exceeds a threshold p. It dynamically adjusts the candidate pool size based on the model's confidence.
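The dynamic pool size can be seen in a short sketch. This version stops as soon as the cumulative probability reaches or exceeds p (implementations differ slightly on that boundary), and the function name and toy distributions are assumptions.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of most probable token ids whose total probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


confident = [0.9, 0.05, 0.03, 0.02]  # model is sure: small pool
uncertain = [0.4, 0.3, 0.2, 0.1]     # model is unsure: larger pool
small_pool = top_p_filter(confident, 0.8)
large_pool = top_p_filter(uncertain, 0.8)
print(small_pool)
print(large_pool)
```

With the same threshold, the confident distribution keeps one candidate while the uncertain one keeps three.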
AI Infrastructure
TPU
A TPU (Tensor Processing Unit) is a custom-designed AI accelerator chip developed by Google specifically for neural network computations. TPUs power Google's AI services and are available through Google Cloud.
Data Science
Training Data
Training data is the dataset used to teach a machine learning model to recognize patterns and make predictions. The quality, quantity, and representativeness of training data fundamentally determine model capabilities.
Machine Learning
Transfer Learning
Transfer learning is the technique of applying knowledge gained from training on one task to improve performance on a different but related task. By reusing pre-trained knowledge, it makes it possible to build strong models from limited domain-specific data.
Deep Learning
Transformer
The Transformer is a neural network architecture based on self-attention mechanisms that processes all input positions in parallel. Introduced in 2017, it became the foundation for virtually all modern large language models and many vision models.
AI Ethics & Safety
Trustworthy AI
Trustworthy AI is an approach to building AI systems that are reliable, fair, transparent, privacy-preserving, and safe. It encompasses technical, ethical, and governance dimensions of AI development.
Fundamentals
Turing Test
The Turing Test is a measure of machine intelligence proposed by Alan Turing in 1950, in which a human judge evaluates whether they are conversing with a human or a machine. If the judge cannot reliably tell the two apart, the machine is said to exhibit intelligent behavior.
Data Science
Type I and Type II Error
A Type I error (false positive) occurs when a model predicts a positive outcome for a case that is actually negative, while a Type II error (false negative) occurs when it predicts a negative outcome for a case that is actually positive. Understanding these errors is crucial for evaluating model performance in context.
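Counting the two error types from predictions is straightforward. The sketch below assumes binary labels with 1 as the positive class; the data and function name are made up.

```python
def error_counts(y_true, y_pred):
    """Return (type_i, type_ii): counts of false positives and false negatives."""
    type_i = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    type_ii = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return type_i, type_ii


y_true = [1, 0, 1, 0, 0, 1]  # toy ground truth
y_pred = [1, 1, 0, 0, 1, 1]  # toy predictions
type_i, type_ii = error_counts(y_true, y_pred)
print(type_i, type_ii)  # 2 1
```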
U
4 entries
Machine Learning
Underfitting
Underfitting occurs when a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and test data. Increasing model complexity or training longer can address underfitting.
Machine Learning
Unsupervised Learning
Unsupervised learning is a machine learning approach where models discover patterns and structure in data without labeled examples. Clustering, dimensionality reduction, and anomaly detection are common unsupervised tasks.
Deep Learning
Unsupervised Pre-training
Unsupervised pre-training is the process of training a model on unlabeled data to learn general representations before fine-tuning on labeled data. It underpins modern foundation models and transfer learning.
Deep Learning
Upsampling
Upsampling is a technique that increases the spatial resolution of data, commonly used in image generation and segmentation to produce higher-resolution outputs. Transposed convolutions and interpolation are common upsampling methods.
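The simplest interpolation scheme, nearest-neighbor upsampling, repeats each pixel along both axes. A minimal sketch on a nested-list "image"; the function name is hypothetical, and transposed convolutions (which are learned) are not shown.

```python
def upsample_nearest(image, factor):
    """Nearest-neighbor upsampling: repeat every pixel `factor` times in each direction."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


image = [[1, 2],
         [3, 4]]
upsampled = upsample_nearest(image, 2)
for row in upsampled:
    print(row)
```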
V
6 entries
Data Science
Validation Set
A validation set is a portion of data held out from training to evaluate model performance during development and tune hyperparameters. It helps detect overfitting and guides model selection before final testing.
Deep Learning
Vanishing Gradient
The vanishing gradient problem occurs when gradients become extremely small during backpropagation through many layers, making it difficult to train deep networks. Skip connections and normalization techniques were developed to address this issue.
Generative AI
Variational Autoencoder
A Variational Autoencoder (VAE) is a generative model that learns a probabilistic latent representation of data. VAEs can generate new data by sampling from the learned latent space distribution.
Fundamentals
Vector
A vector is an ordered array of numbers that represents a point or direction in multi-dimensional space. In AI, vectors (embeddings) encode the semantic meaning of words, images, and other data types.
AI Infrastructure
Vector Database
A vector database is a specialized storage system optimized for storing, indexing, and querying high-dimensional vector embeddings. It powers semantic search, recommendation systems, and RAG applications.
Computer Vision
Vision Transformer
A Vision Transformer (ViT) applies the Transformer architecture to image recognition by treating image patches as tokens. ViTs have matched or exceeded CNNs on many computer vision benchmarks.
W
5 entries
AI Ethics & Safety
Watermarking
AI watermarking is the technique of embedding hidden, detectable signals in AI-generated content to identify its origin. It helps distinguish AI-generated text and images from human-created content.
Deep Learning
Weight
A weight is a numerical parameter in a neural network that determines the strength of the connection between neurons. Weights are learned during training through backpropagation and gradient descent.
Deep Learning
Weight Initialization
Weight initialization is the strategy for setting initial values of neural network weights before training begins. Proper initialization (like Xavier or He initialization) prevents vanishing or exploding gradients.
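Both schemes choose the spread of the initial weights from the layer's size. The sketch below uses the normal-distribution variants: Xavier with standard deviation sqrt(2 / (fan_in + fan_out)) and He with sqrt(2 / fan_in). The helper names and layer sizes are assumptions for illustration.

```python
import math
import random


def xavier_std(fan_in, fan_out):
    """Standard deviation for Xavier (Glorot) normal initialization."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def he_std(fan_in):
    """Standard deviation for He normal initialization (commonly paired with ReLU)."""
    return math.sqrt(2.0 / fan_in)


def init_weights(fan_in, fan_out, std, seed=0):
    """Sample a fan_out x fan_in weight matrix from a zero-mean normal distribution."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


weights = init_weights(256, 128, he_std(256))
print(xavier_std(256, 128))
print(he_std(256))
print(len(weights), len(weights[0]))
```

Wider layers get smaller initial weights, which keeps the scale of activations and gradients roughly constant from layer to layer.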
Natural Language Processing
Word Embedding
A word embedding is a dense vector representation of a word that captures its semantic meaning and relationships to other words. Words with similar meanings are mapped to nearby points in embedding space.
Natural Language Processing
Word2Vec
Word2Vec is a pioneering neural network model that learns word embeddings from large text corpora. Developed by Google in 2013, it demonstrated that vector arithmetic on word embeddings captures semantic relationships.
X
2 entries
AI Ethics & Safety
XAI
XAI (Explainable Artificial Intelligence) refers to methods and techniques that make AI decision-making processes transparent and interpretable to humans. XAI builds trust and enables accountability in AI systems.
Machine Learning
XGBoost
XGBoost (Extreme Gradient Boosting) is a highly optimized gradient boosting library known for its speed and performance in structured data competitions. It remains one of the most popular tools for tabular data.
Z
2 entries
Machine Learning
Zero-Shot Learning
Zero-shot learning is the ability of a model to correctly handle tasks or recognize classes it has never been explicitly trained on. Large language models demonstrate strong zero-shot capabilities across diverse tasks.
Generative AI
Zero-Shot Prompting
Zero-shot prompting is providing a language model with task instructions and no examples, relying on its pre-trained knowledge to perform the task. It tests the model's ability to generalize from training to novel instructions.