AI Engineering Glossary
Gradient Descent

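An optimization algorithm used to train machine learning models by iteratively adjusting parameters to minimize a loss function. At each step, the algorithm computes the gradient of the loss with respect to the parameters and moves the parameters a small step in the opposite direction, which is the direction of steepest decrease. The size of each step is set by a learning rate.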
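As an illustration, the update loop can be sketched in a few lines of plain Python. This is a minimal sketch, not a production optimizer: the function names (`gradient_descent`, `loss`, `loss_grad`), the one-parameter quadratic loss, the learning rate of 0.1, and the 100 steps are all assumptions chosen for the example.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        # Move opposite to the gradient, scaled by the learning rate.
        w = w - learning_rate * grad(w)
    return w


# Hypothetical toy loss with a single parameter; its minimum is at w = 3.
def loss(w):
    return (w - 3.0) ** 2


# Analytic gradient (derivative) of the toy loss above.
def loss_grad(w):
    return 2.0 * (w - 3.0)


w_final = gradient_descent(loss_grad, w0=0.0)
print(w_final, loss(w_final))
```

With these settings the parameter settles very close to 3, the minimizer of the toy loss. A learning rate that is too large can make the updates overshoot and diverge, while one that is too small makes convergence slow.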

