Decoder-Only Architecture
Decoder-only architecture is a design in sequence processing tasks where the model predicts the next element in a sequence by relying only on previously generated elements. Unlike architectures with separate encoder and decoder parts (often used in tasks like translation), decoder-only models, such as GPT, are adept at autoregression. They are useful in tasks where generation, continuation, or transformation of sequences based solely on historical context is required.