Glossary
RAG (Retrieval-Augmented Generation)
AI pattern where an LLM generates answers from documents retrieved at query time, rather than from training data alone.
Definition
Retrieval-Augmented Generation is the dominant production pattern for LLM applications that need to answer questions over private or up-to-date content. The system first retrieves relevant chunks from a document store (often using embeddings and vector search, sometimes combined with keyword retrieval), then passes those chunks to the LLM as context so the generated answer is grounded in the source material. RAG sidesteps the cost and risk of fine-tuning while keeping answers current and citable.
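The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `CHUNKS` list stands in for a document store, and all function names and sample texts are hypothetical. The final call to an LLM is left out; the sketch stops at the grounded prompt that would be sent to it.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': term counts (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Number the retrieved chunks so the answer can cite them."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(context, start=1))
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {query}"
    )


# Hypothetical document store: three chunks of private content.
CHUNKS = [
    "Refunds are issued within 14 days of a return request.",
    "Our office is closed on public holidays.",
    "Return requests must include the original order number.",
]

query = "How long do refunds take after a return request?"
top = retrieve(query, CHUNKS, k=2)
prompt = build_prompt(query, top)  # this string would be passed to the LLM
print(prompt)
```

In a real system, `embed` would call an embedding model and `retrieve` would query a vector database, possibly combined with keyword search; the shape of the pipeline stays the same.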
Why it matters
RAG is what makes 'AI over our docs' actually work in production. Done well, it substantially reduces hallucinations and lets you cite sources. Done badly, it returns confident answers built on irrelevant chunks, so an evaluation harness that checks both retrieval relevance and answer faithfulness is mandatory.
See also
LLM (Large Language Model)
A neural-network model trained on large text corpora to generate, summarise, classify and reason over text and code.
Vector Database
A database optimised for storing and querying high-dimensional vectors (embeddings) by similarity.
Embeddings
Dense numerical vector representations of text, images or audio that capture semantic similarity.