The Complete AI Glossary
Every AI term your business will encounter — defined in plain English for decision-makers and practitioners alike. Search, filter by category, and jump straight to what you need.
Algorithm
A step-by-step set of rules or instructions a computer follows to solve a problem or make a decision. In AI, algorithms learn patterns from data rather than following hard-coded rules written by humans.
Real-world example: A loan-approval algorithm learns from thousands of past applications which factors (income, debt ratio, credit history) predict repayment — then applies those learned patterns to new applicants automatically, in milliseconds.
Attention Mechanism
A technique that lets AI models focus on the most relevant parts of their input when producing an output, rather than treating every word or data point equally. It is the core innovation behind modern language models (Transformers).
Real-world example: When translating “The bank was steep,” the attention mechanism lets the model focus on surrounding words (river, path) to correctly deduce “bank” means a riverbank — not a financial institution.
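The idea can be sketched in a few lines of plain Python: a query vector is scored against each key, the scores become weights via softmax, and the output is the weighted average of the values. The vectors below are toy numbers invented for illustration, not from any real model.

```python
import math

def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Each key is scored against the query; the resulting weights decide
    how much each value vector contributes to the output.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]

# The query matches the second key far more strongly, so the output
# is pulled towards the second value vector.
out = attention(query=[1.0, 0.0],
                keys=[[0.0, 1.0], [4.0, 0.0]],
                values=[[10.0, 0.0], [0.0, 10.0]])
```

This is exactly the "focus on the relevant parts" behaviour from the definition: nearly all the weight lands on the key that resembles the query.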
AI Agent
An AI system that can independently perceive its environment, make decisions, and take multi-step actions to achieve a goal without human input at every stage. Modern agents can use tools, browse the web, write code, and call APIs.
Real-world example: A sales AI agent monitors your CRM, drafts personalised outreach emails, schedules follow-up reminders, and updates deal stages — completing an entire workflow overnight without human supervision.
Anomaly Detection
The automatic identification of data points, events, or patterns that deviate significantly from expected norms. Used extensively in fraud detection, quality control, cybersecurity, and predictive maintenance.
Real-world example: A bank’s anomaly detection model flags a card normally used in Sydney that suddenly appears in Lagos at 3 am — triggering an automatic hold before any funds are lost.
Artificial Intelligence (AI)
The simulation of human intelligence processes by machines. AI systems can learn from data, reason through problems, and perform tasks that historically required human cognition — such as recognising speech, making decisions, or translating languages.
AI Bias
Systematic errors in an AI model’s outputs that result from prejudiced assumptions baked into the training data or model design. AI bias can lead to unfair outcomes for particular groups in hiring, lending, healthcare, and more.
Real-world example: A CV-screening AI trained on 10 years of historical hires (predominantly male) learns to rank male applicants higher — not because they’re more qualified, but because the data encoded past discrimination.
Benchmark
A standardised test or dataset used to evaluate and compare the performance of AI models. Benchmarks let teams measure whether a new model is genuinely better than existing ones, and by how much.
Black Box
An AI model whose internal workings are too complex for humans to interpret. It accepts inputs and produces outputs, but it is often impossible to explain exactly why it made a specific decision. Contrasted with explainable AI (XAI).
Real-world example: A deep neural network that predicts credit risk may be highly accurate but unable to explain why it rejected a specific applicant — creating regulatory challenges in finance and healthcare.
Classification
A machine learning task where the model assigns inputs to one of several predefined categories. One of the most common AI tasks in business — from spam detection to medical diagnosis.
Real-world example: An email classifier assigns each message to “spam” or “not spam.” A radiology AI classifies scans as “tumour detected” or “clear.”
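As a toy illustration of assigning inputs to predefined categories, here is a deliberately simple keyword-scoring spam filter. A real classifier learns its scoring from labelled data; the word list and threshold here are invented purely for illustration.

```python
# Hypothetical keyword list; a trained model would learn these signals from data.
SPAM_WORDS = {"winner", "free", "prize", "urgent", "claim"}

def classify(message: str) -> str:
    """Assign a message to one of two predefined categories."""
    words = message.lower().split()
    score = sum(1 for w in words if w.strip(".,!?") in SPAM_WORDS)
    # Threshold chosen for illustration only.
    return "spam" if score >= 2 else "not spam"

label = classify("URGENT! Claim your FREE prize now")  # → "spam"
```

The structure is the same whatever the domain: input in, one of a fixed set of labels out.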
Context Window
The maximum amount of text (measured in tokens) that a language model can process in a single request — both the input it receives and the output it generates. Larger context windows allow models to handle longer documents and conversations.
Real-world example: A 128K-token context window can process roughly 100,000 words in one go — enough to analyse an entire legal contract library or a year’s worth of customer support tickets at once.
Clustering
An unsupervised learning technique that groups similar data points together without pre-assigned labels. The model finds natural groupings in the data on its own.
Real-world example: A retail business uses clustering to segment customers into distinct groups (bargain hunters, brand loyalists, seasonal buyers) based purely on purchase behaviour — without manually defining those categories first.
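K-means is the classic clustering algorithm: repeatedly assign each point to its nearest centre, then move each centre to the mean of its assigned points. A bare-bones one-dimensional sketch in plain Python, with invented customer-spend figures:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal 1-D k-means clustering (illustrative, not production code)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(iters):
        # Assign every point to its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: (p - centres[i]) ** 2)
            clusters[nearest].append(p)
        # Move each centre to the mean of its assigned points.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return sorted(centres)

# Two natural groups emerge without any labels being provided:
spend = [10, 12, 11, 13, 200, 210, 205, 195]
centres = kmeans(spend, k=2)  # → [11.5, 202.5]
```

No one told the algorithm there were "low spenders" and "high spenders"; it found those groupings itself, which is the essence of unsupervised learning.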
Computer Vision
A field of AI that enables machines to interpret and understand visual information from images and video. Modern computer vision models can detect objects, recognise faces, read text, and analyse medical scans.
Data Lake
A centralised repository that stores large volumes of raw data in its native format until it’s needed — structured tables, unstructured text, images, video, and more. Contrasted with a data warehouse, which stores only cleaned, processed data.
Deep Learning
A subset of machine learning that uses neural networks with many layers to learn extremely complex patterns from large amounts of data. “Deep” refers to the many layers in the network. Powers most modern AI breakthroughs — image recognition, voice assistants, LLMs.
Model Drift
The degradation of an AI model’s performance over time because the real-world data it encounters changes from the data it was trained on. Left unmanaged, drift silently erodes accuracy.
Real-world example: A fraud detection model trained before COVID-19 might drift dramatically afterwards as spending patterns changed entirely — causing it to miss new fraud patterns it was never trained to recognise.
ETL (Extract, Transform, Load)
The process of pulling data from source systems, cleaning and restructuring it, then loading it into a target system (a data warehouse or feature store) where it can be used by AI models. Good ETL is the foundation of reliable AI.
Explainable AI (XAI)
AI techniques and methods that make model decisions transparent and understandable to humans — explaining what factors drove a specific output. Critical for regulated industries and for building user trust.
Real-world example: An XAI-powered loan system doesn’t just reject an application — it tells the applicant: “Your application was declined primarily due to a debt-to-income ratio above 45%.”
Feature Engineering
The process of selecting, transforming, and creating input variables (features) from raw data to improve a machine learning model’s performance. Often more impactful than choosing the model itself.
Real-world example: For a churn prediction model, raw data might include “last login date.” A better feature is “days since last login” or “login frequency change vs. last quarter” — capturing the behaviour pattern that actually matters.
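The churn example above can be made concrete in a few lines of Python. The field names are hypothetical; the point is the transformation from raw fields to behaviour-pattern features.

```python
from datetime import date

def engineer_features(last_login: date, logins_this_q: int,
                      logins_last_q: int, today: date) -> dict:
    """Turn raw CRM fields into features a churn model can actually use."""
    return {
        # A raw date becomes a number the model can compare across customers.
        "days_since_last_login": (today - last_login).days,
        # Relative change captures the trend better than either raw count alone.
        "login_frequency_change": (logins_this_q - logins_last_q) / max(logins_last_q, 1),
    }

features = engineer_features(date(2024, 5, 1), logins_this_q=4,
                             logins_last_q=20, today=date(2024, 6, 1))
# → {"days_since_last_login": 31, "login_frequency_change": -0.8}
```

A login frequency that dropped 80% quarter-on-quarter is a far stronger churn signal than the raw date ever was.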
Fine-Tuning
The process of taking a pre-trained AI model and training it further on a specific dataset for a specific task. Fine-tuning allows organisations to customise foundation models for their industry, tone of voice, and use case — without training from scratch.
Real-world example: A legal firm fine-tunes an LLM on 10 years of its own contracts and case notes — producing a model that understands their specific clause language and jurisdiction.
Forecasting
Using historical data and patterns to predict future values — such as demand, revenue, stock levels, or equipment failure probability. Time-series forecasting is one of the highest-ROI AI applications in business.
Generative AI
A class of AI models that can generate new content — text, images, code, audio, video — rather than just classifying or predicting existing information. Powered by foundation models like GPT-4, Claude, and Gemini.
AI Governance
The policies, processes, and frameworks an organisation uses to ensure AI systems are developed and used responsibly — covering accountability, transparency, fairness, data privacy, and regulatory compliance.
Grounding
The practice of connecting an LLM’s outputs to verified, real-world data sources to prevent hallucinations. Grounded models cite sources and are constrained to only claim what the evidence supports.
Hallucination
When an AI model generates false information confidently and fluently — inventing facts, citations, or events that do not exist. A key risk in enterprise LLM deployments, managed through grounding and retrieval-augmented generation (RAG).
Real-world example: An LLM asked to summarise a legal case might invent a plausible-sounding but entirely fictional prior ruling — which is why human review and grounding mechanisms are essential in high-stakes applications.
Hyperparameter Tuning
The process of optimising the configuration settings (hyperparameters) that are chosen before training begins — such as learning rate, number of layers, or batch size. Good tuning significantly improves model accuracy.
Inference
The process of running a trained AI model on new data to generate predictions or outputs. Distinct from training: training is learning, inference is applying what was learned. The speed and cost of inference are critical production concerns.
Knowledge Graph
A structured representation of information as a network of entities and the relationships between them. Knowledge graphs help AI systems reason about complex, interconnected domains like products, organisations, and medical conditions.
Data Labelling
The process of adding meaningful tags or labels to raw data so a supervised learning model can learn from it. High-quality labelling is often the most expensive and time-consuming part of an AI project.
Large Language Model (LLM)
A type of AI model trained on vast quantities of text that can understand, generate, and reason about language. LLMs power chatbots, coding assistants, summarisation tools, and much more. Examples include GPT-4, Claude, and Gemini.
Machine Learning (ML)
A branch of AI where systems learn from data to improve their performance on a task — without being explicitly programmed with rules for every scenario. The model learns patterns from examples and generalises them to new situations.
MLOps
A set of practices combining machine learning with software operations (DevOps) to deploy, monitor, and maintain AI models reliably in production. MLOps covers CI/CD for models, data pipelines, monitoring, retraining triggers, and version control.
Real-world example: Without MLOps, a fraud model might silently degrade as fraud patterns evolve. With MLOps, drift is detected automatically, a retraining job is triggered, and the improved model is deployed — all without a human spotting the problem.
Overfitting
When a model learns the training data too precisely — including its noise and quirks — and performs poorly on new, unseen data. The model has memorised rather than generalised. Overfitting is one of the most common problems in ML.
Multimodal AI
AI systems that can understand and generate multiple types of data simultaneously — text, images, audio, video, and more. Multimodal models can analyse a photo and describe it, or listen to speech and respond in text.
Neural Network
A computing system loosely inspired by the human brain, consisting of layers of interconnected nodes (“neurons”) that process information. Neural networks learn by adjusting the strength of connections based on feedback from data.
Natural Language Processing (NLP)
A branch of AI focused on enabling computers to understand, interpret, and generate human language. NLP powers chatbots, voice assistants, translation systems, sentiment analysis, and document intelligence.
Optimisation
The process of adjusting a model’s parameters to minimise error (or maximise performance) during training. Gradient descent is the most common optimisation algorithm, iteratively nudging the model towards better predictions.
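Gradient descent is easy to see on a one-dimensional toy problem. The sketch below minimises f(x) = (x - 3)², whose minimum is known to be at x = 3; the learning rate and step count are arbitrary illustrative choices.

```python
def gradient_descent(start: float, lr: float = 0.1, steps: int = 100) -> float:
    """Minimise f(x) = (x - 3)^2 by stepping against the gradient f'(x) = 2(x - 3)."""
    x = start
    for _ in range(steps):
        grad = 2 * (x - 3)
        x -= lr * grad  # nudge x in the direction that reduces the error
    return x

x_min = gradient_descent(start=10.0)  # converges towards 3.0
```

Training a neural network is this same loop, repeated over millions of parameters at once, with the gradient computed from the training data.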
Prompt Engineering
The craft of designing input instructions (prompts) that elicit the best possible output from a language model. Well-engineered prompts dramatically improve accuracy, reduce hallucinations, and constrain model behaviour to appropriate outputs.
Predictive Analytics
Using historical data, statistical models, and machine learning to forecast future events or behaviours. Common business applications include customer churn prediction, demand forecasting, fraud probability scoring, and equipment failure prediction.
Foundation Model
A large AI model trained on enormous datasets at massive cost by major AI labs (OpenAI, Anthropic, Google). Organisations build on top of these foundation models rather than training from scratch — dramatically reducing cost and time.
Retrieval-Augmented Generation (RAG)
An architecture that enhances LLM responses by first retrieving relevant documents from a knowledge base, then using those documents as context when generating the answer. RAG dramatically reduces hallucinations and keeps responses grounded in verified information.
Real-world example: A customer support chatbot using RAG retrieves the exact product manual section before answering — instead of guessing — ensuring every answer is factually accurate and citable.
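The retrieve-then-generate flow can be sketched in a few lines of Python. Real RAG systems retrieve by vector similarity over embeddings; simple word overlap stands in for that here, and the manual snippets and function names are invented for illustration.

```python
def retrieve(query: str, docs: dict, top_k: int = 1) -> list:
    """Rank documents by word overlap with the query (a stand-in for
    the vector similarity search a real RAG system would use)."""
    q_words = set(query.lower().split())
    scored = sorted(docs.items(),
                    key=lambda kv: len(q_words & set(kv[1].lower().split())),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

def build_prompt(query: str, docs: dict) -> str:
    """Step 1: retrieve. Step 2: generate, constrained to the retrieved context."""
    context = "\n".join(docs[d] for d in retrieve(query, docs))
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

manuals = {
    "reset": "to reset the router hold the button for ten seconds",
    "billing": "invoices are emailed on the first of each month",
}
prompt = build_prompt("how do I reset the router", manuals)
```

The prompt handed to the LLM contains the exact manual passage, so the model answers from evidence instead of guessing.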
Regression
A type of supervised learning that predicts a continuous numerical value, as opposed to classification (which predicts a category). Used for forecasting prices, estimating demand, projecting revenue, and many other business applications.
Reinforcement Learning
A type of machine learning where an agent learns by taking actions in an environment and receiving rewards or penalties. It learns to maximise cumulative reward over time through trial and error. Used in robotics, game-playing AI, and recommendation systems.
AI ROI (Return on Investment)
The measurable business value generated by an AI system relative to its cost. Strong AI ROI requires defining success metrics before development begins, measuring them rigorously after deployment, and attributing results accurately.
Real-world example: A predictive maintenance model that costs $200K to build but prevents $1.8M in annual downtime delivers an 800% first-year ROI ($1.6M net gain on a $200K investment) — a calculation that should be modelled before any project begins.
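The arithmetic behind that example uses the standard formula ROI = (benefit - cost) / cost, expressed as a percentage:

```python
def first_year_roi(benefit: float, cost: float) -> float:
    """Return on investment as a percentage: net gain divided by cost."""
    return (benefit - cost) / cost * 100

# $1.8M of prevented downtime against a $200K build cost:
roi = first_year_roi(benefit=1_800_000, cost=200_000)  # → 800.0
```

Running the same function over a range of benefit estimates before a project begins gives an honest best-case/worst-case ROI band rather than a single optimistic number.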
Supervised Learning
A type of machine learning where a model is trained on labelled data — examples where the correct answer is known. The model learns to map inputs to outputs by studying these examples. The most common ML paradigm in commercial applications.
Sentiment Analysis
An NLP technique that automatically identifies and categorises the emotional tone of text — positive, negative, or neutral — and often more granular emotions. Used for brand monitoring, customer feedback analysis, and social media intelligence.
Token
The way LLMs break text into small chunks (tokens) for processing. A token is roughly 3–4 characters or about three-quarters of a word in English. LLM pricing and context limits are measured in tokens rather than words or characters.
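The 3–4 characters rule of thumb makes quick capacity and cost estimates easy. A sketch in Python — note that real tokenisers vary by model, and the per-1K-token price below is a placeholder, not any provider's actual rate:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token count from the ~4-characters-per-token rule of thumb."""
    return round(len(text) / chars_per_token)

def estimate_cost(text: str, usd_per_1k_tokens: float) -> float:
    """Back-of-envelope input cost; check your provider's real pricing."""
    return estimate_tokens(text) / 1000 * usd_per_1k_tokens

tokens = estimate_tokens("Hello world, this is a quick token estimate.")  # → 11
```

For real budgeting, use the tokeniser published for the specific model, since counts differ between models and languages.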
Training Data
The dataset used to teach an AI model how to perform a task. The quality, quantity, and representativeness of training data is the single biggest determinant of model performance. “Garbage in, garbage out” applies directly.
Transformer
The neural network architecture introduced by Google in 2017 that revolutionised AI. Transformers use attention mechanisms to process data in parallel (rather than sequentially), enabling training on massive datasets. All major LLMs — GPT, Claude, Gemini — are transformers.
Transfer Learning
Applying knowledge gained from one task to a different but related task. A model pre-trained on millions of medical images can be fine-tuned for a specific disease with far fewer examples — dramatically reducing data and compute requirements.
Unsupervised Learning
Machine learning on unlabelled data — the model finds its own structure and patterns without being told what to look for. Clustering, dimensionality reduction, and anomaly detection are common unsupervised learning tasks.
Vector Database
A database designed to store and search vector embeddings — numerical representations of text, images, or other data that capture semantic meaning. Vector databases are the backbone of RAG systems and semantic search applications.
Real-world example: A legal AI system stores all contracts as vector embeddings. When a lawyer asks “show me all clauses about IP ownership in US contracts,” the vector database finds semantically relevant clauses even if they don’t use those exact words.
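At its core, the search a vector database performs is a nearest-neighbour lookup by similarity, most often cosine similarity. A miniature in-memory version in plain Python, with invented three-dimensional "embeddings" (real embeddings have hundreds or thousands of dimensions, and real databases use specialised indexes to search billions of them quickly):

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=1):
    """Return the ids of the stored vectors most similar to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

# Toy index of clause embeddings (ids and vectors are invented):
index = [
    ("ip_ownership_clause", [0.9, 0.1, 0.0]),
    ("payment_terms_clause", [0.0, 0.2, 0.9]),
]
best = search([1.0, 0.0, 0.1], index)  # → ["ip_ownership_clause"]
```

Because the match is by direction in embedding space rather than exact words, semantically similar text is found even when the wording differs — the behaviour the legal example above relies on.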
Zero-Shot / Few-Shot Learning
Zero-shot: an LLM performs a task from a description alone, with no examples. Few-shot: the model is given a small number of examples in the prompt before being asked to perform the task. Both leverage the broad knowledge baked into foundation models during pre-training.
Real-world example: A few-shot prompt shows the model three examples of correctly classified customer complaints, then asks it to classify the next 10,000 — with no additional training or fine-tuning required.
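Few-shot prompting is, mechanically, just careful string assembly. A sketch with invented complaint categories and a format chosen purely for illustration:

```python
def few_shot_prompt(examples, new_input):
    """Assemble a few-shot prompt: labelled examples first, then the
    item the model should classify."""
    lines = [f"Complaint: {text}\nCategory: {label}\n" for text, label in examples]
    lines.append(f"Complaint: {new_input}\nCategory:")
    return "\n".join(lines)

examples = [
    ("My parcel never arrived", "delivery"),
    ("I was charged twice", "billing"),
    ("The app crashes on login", "technical"),
]
prompt = few_shot_prompt(examples, "Refund still not processed")
```

The prompt ends mid-pattern at "Category:", so the model's most natural continuation is the label for the new complaint — no training or fine-tuning involved.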
Ready to Apply These Concepts to Your Business?
Understanding AI vocabulary is the first step. The second is a free conversation with our team about which technologies actually apply to your specific challenges — with honest advice on where AI will and won’t help.