Glossary
Key terms in LLMOps, AI pipelines, and production AI governance.
AI Agent Orchestration
AI agent orchestration is the coordination of multiple AI agents working together on a task, managing their communication, task delegation, and output synthesis.
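The pattern can be sketched as an orchestrator that delegates subtasks to specialized agents and merges their outputs. This is a minimal illustration; the agents here are plain stub functions, not real model calls, and all names are hypothetical.

```python
# Minimal sketch of agent orchestration: delegate subtasks to
# specialized agents (stubbed as functions) and synthesize the result.

def research_agent(task: str) -> str:
    # Stub: a real agent would call a model or tool to gather facts.
    return f"facts about {task}"

def writing_agent(facts: str) -> str:
    # Stub: a real agent would draft prose from the gathered facts.
    return f"draft based on {facts}"

def orchestrate(task: str) -> str:
    facts = research_agent(task)   # task delegation
    draft = writing_agent(facts)   # hand one agent's output to the next
    return draft                   # output synthesis
```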
AI Governance
AI governance is the framework of policies, processes, and technical controls that ensure AI systems operate safely, ethically, and in compliance with regulations.
AI Pipeline
An AI pipeline is a sequence of connected processing steps that transforms inputs into AI-generated outputs, including data retrieval, model inference, post-processing, and quality evaluation.
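One common way to model this is a shared state dictionary passed through an ordered list of steps. The step names and structure below are illustrative, not a specific framework's API.

```python
# Sketch: an AI pipeline as ordered steps transforming a shared state.

def retrieve(state: dict) -> dict:
    state["context"] = ["retrieved passage"]  # data retrieval step
    return state

def infer(state: dict) -> dict:
    # Stub: a real step would call a model with query + context.
    state["answer"] = f"answer grounded in {state['context'][0]}"
    return state

def evaluate(state: dict) -> dict:
    state["quality_ok"] = len(state["answer"]) > 0  # quality evaluation
    return state

def run_pipeline(query: str) -> dict:
    state = {"query": query}
    for step in (retrieve, infer, evaluate):
        state = step(state)
    return state
```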
Eval Gate
An eval gate is an automated quality checkpoint that runs evaluation suites against an AI pipeline and blocks deployment if quality thresholds are not met.
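In its simplest form, an eval gate compares suite scores against per-metric thresholds and fails the deployment if any is missed. The thresholds and metric names below are illustrative assumptions.

```python
# Sketch of an eval gate: block deployment when any metric
# falls below its threshold. Thresholds here are illustrative.
THRESHOLDS = {"accuracy": 0.90, "faithfulness": 0.85}

def eval_gate(scores: dict) -> bool:
    failures = [
        metric for metric, threshold in THRESHOLDS.items()
        if scores.get(metric, 0.0) < threshold
    ]
    return not failures  # True -> deployment may proceed
```

In CI, a False return would typically translate to a non-zero exit code that halts the release job.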
LLM Evaluation
LLM evaluation is the systematic process of measuring language model output quality across dimensions like accuracy, faithfulness, relevance, and safety.
LLM Observability
LLM observability is the practice of collecting, analyzing, and visualizing traces, metrics, and logs from language model applications to understand system behavior in production.
LLM Tracing
LLM tracing is the practice of recording the full execution path of a language model request, including prompt construction, model calls, tool use, and response generation, as a structured trace.
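A trace is typically a list of timed spans, one per stage of the request. This sketch records two spans with stubbed work; span names and the trace shape are assumptions, not a specific tracing library's format.

```python
# Sketch: record each stage of a request as a timed span in a trace.
import time
import uuid

def trace_request(user_prompt: str):
    trace = {"trace_id": str(uuid.uuid4()), "spans": []}

    def span(name, fn, *args):
        start = time.time()
        output = fn(*args)
        trace["spans"].append({"name": name,
                               "duration_s": time.time() - start})
        return output

    full_prompt = span("prompt_construction",
                       lambda p: f"System: ...\nUser: {p}", user_prompt)
    answer = span("model_call",
                  lambda p: "stub completion", full_prompt)  # stubbed call
    return answer, trace
```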
LLM-as-a-Judge
LLM-as-a-Judge is an evaluation pattern where a language model scores or ranks the outputs of another language model against defined criteria.
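The pattern reduces to prompting a judge model with the criteria and the candidate output, then parsing its score. The judge call below is a stub standing in for a real model API; prompt wording and the 1-5 scale are assumptions.

```python
# Sketch of LLM-as-a-Judge: a judge model scores a candidate answer
# against a named criterion. The judge call is stubbed.

JUDGE_PROMPT = """Rate the answer for {criterion} on a scale of 1 to 5.
Question: {question}
Answer: {answer}
Score:"""

def call_judge_model(prompt: str) -> str:
    # Stub: a real implementation would send `prompt` to an LLM API.
    return "4"

def judge(question: str, answer: str, criterion: str = "faithfulness") -> int:
    prompt = JUDGE_PROMPT.format(criterion=criterion,
                                 question=question, answer=answer)
    return int(call_judge_model(prompt).strip())
```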
LLMOps
LLMOps is the set of practices, tools, and infrastructure for deploying, monitoring, evaluating, and governing large language models in production.
Prompt Management
Prompt management is the practice of versioning, testing, and deploying prompts as first-class software artifacts with change tracking and rollback capabilities.
Proof Bundle
A proof bundle is an immutable record that packages evaluation results, approval decisions, and deployment metadata into a single auditable artifact for AI pipeline governance.
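One way to make such a record tamper-evident is to hash its canonicalized contents. This is a generic sketch of that idea, not a specific product's bundle format; field names are illustrative.

```python
# Sketch: a proof bundle sealed with a content hash so that any
# later modification is detectable.
import hashlib
import json

def make_proof_bundle(eval_results: dict, approver: str,
                      deploy_meta: dict) -> dict:
    payload = {"eval_results": eval_results,
               "approved_by": approver,
               "deployment": deploy_meta}
    digest = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()).hexdigest()
    return {**payload, "bundle_hash": digest}

def verify_bundle(bundle: dict) -> bool:
    payload = {k: v for k, v in bundle.items() if k != "bundle_hash"}
    digest = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()).hexdigest()
    return digest == bundle["bundle_hash"]
```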
RAG Evaluation
RAG evaluation measures the quality of retrieval-augmented generation systems across retrieval accuracy, context relevance, answer faithfulness, and end-to-end response quality.
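The retrieval-accuracy dimension is often measured with rank-based metrics such as recall@k, while faithfulness and relevance usually rely on LLM-as-a-Judge scoring. A minimal recall@k, assuming document IDs for retrieved and relevant items:

```python
# Sketch: recall@k for the retrieval side of RAG evaluation.
def recall_at_k(retrieved_ids: list, relevant_ids: list, k: int = 5) -> float:
    if not relevant_ids:
        return 0.0
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)  # fraction of relevant docs found
```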