AI Engineering Glossary

Mistral

Mistral AI is a Paris-based AI company that develops efficient language models, including open-weight releases such as Mistral 7B and the mixture-of-experts model Mixtral 8x7B. Its models are known for strong performance at smaller parameter counts than many other frontier models.

a

Advanced RAG, Agent, Agentic AI, Agents, AI-as-a-Judge, AI Engineering, AI Judges, AI Orchestration, Alignment, Applications of Foundation Models, Automatic1111, Autoregressive language models

b

Benchmarks, BM25

c

Causal language models, Chain-of-thought (CoT), Chain of Thought Prompting, Chains, ChatGPT, Chroma, Chunking, Chunking Module, Chunking Strategies, Civitai, Claude, Cleaning Layer, Cohere, Constraint Sampling, Context, Context Window, Continuous batching, Copyright Regurgitation, Cost and Latency, Cross entropy

d

Data Category, Data Collection Pipeline, Data Contamination, Data Extraction Module, Data Manipulation in Foundation Models, Data parallelism, Data Quality Concerns, Data Slicing, Data Synthesis, Data Theft, Dataset Engineering, Deterministic, Dispatcher, Domain-Specific Models, Domain-specific models

e

Eleven Labs, Embedding-based retrieval, Embedding Component, Embedding Handler, Embeddings, Evaluation Harness, Evaluation Pipeline, ExLlamaV2, EXL2

f

Factual Consistency, Faiss, Feature, Feature Pipeline, Few-shot learning, Finetuning, Finetuning for Structured Outputs, Flash attention, Foundation Models, Frontier Model, Function calling

g

Gemini, Generative AI, GGUF, GGUF Format, Google AI Studio, GPTQ, Gradient Descent, Greedy sampling, Greedy Sampling, Grounding, Gwen

h

Hallucination, Handler, Hugging Face, Hybrid Search, Hypothetical Document Embeddings

i

In-context learning, Inference, Inference Optimization, Instruction Following, Instruction-Following Capability, Inverse Document Frequency, Inverted index

j

JSON mode

k

k-nearest neighbors, Kaggle, KV cache

l

Large Language Model, Latent Space, Lexical Similarity, Llama, LLM Engineer, LLM Twin, LLM Twin Architecture, LLM Twin Goal, Loading Module, Logical Feature Store, Logits, Logprobs, Luma Labs

m

Masked language models, Memory System, Methods for Structured Outputs, Midjourney, Mistral, Mixture-of-experts (MoE), ML Engineering / MLOps, Model Adaptation, Model API Cons, Model API Pros, Model Build vs. Buy, Model Context Protocol, Model Development Process, Model Licenses, Model Optimization Strategies, Model parallelism, Model Quantization, Multilingual Models, Multimodal Embedding Models

n

N-grams, Natural Language Generation (NLG), NoSQL Database, Notebook LM

o

Ollama, One Shot Prompting, Open-ended Outputs, Open Source Models, Open Weight Models, OVM

p

PagedAttention, Parameter, Perplexity, Perplexity AI, Pinecone, Pipeline parallelism, Post-Processing, Post-retrieval, Post-training, Post-Training Quantization (PTQ), Pre-retrieval, Pre-trained model, Preference Models, Probabilistic, Prompt Engineering, Prompt Injection, Prompt Organization, Prompt Security, Prompting Guide, Proprietary Data

q

Quantization, Quantization-Aware Training

r

RAG, RAG Applications, RAG Ingestion Pipeline, Redundancy Removal, Reinforcement Learning from Human Feedback, Reranking, Retrieval Algorithms, Retrieval-Augmented Generation, Retrieval Augmented Generation, Retrieval Pipeline, Reverse Prompt Engineering, Reward Model, Roleplaying, RunDiffusion, Runway ML

s

SAFE, Sampling, Self-critique prompting, Self-Hosting Cons, Self-Hosting Pros, Self-query, Self-supervision, Semantic Similarity, Speculative decoding, Stability.ai, Stable Diffusion, Stable Diffusion Art, Stochastic Parrot, Structured Outputs, Sub-queries, Supervised Finetuning, System Prompt

t

Temperature, Tensor Parallelism, Term-based retrieval, Term Frequency, Term Frequency-Inverse Document Frequency, Test Time Sampling, Text Generation Inference, Text-to-SQL, TGI, Time to first token, Token, Tokenization, Top-k sampling, Top-p sampling, Toxicity Benchmarks, Training Data Extraction, Training Data Sources

u

Udio

v

vLLM, Vector Database, Vector database, Vocabulary

w

Weight quantization

z

ZenML Pipeline, zenml.io, Zero Shot Prompting