Post-Training Quantization
Post-Training Quantization (PTQ) is the process of converting a trained neural network from high-precision arithmetic, typically 32-bit floating point, to more efficient lower-precision computation, such as 8-bit integers, without retraining the model. It largely preserves the model's accuracy while improving execution speed and reducing its size, making it suitable for deployment on hardware with limited resources such as embedded devices. It differs from quantization-aware training (QAT), where quantization is simulated during the training process itself.
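As a minimal sketch of the core idea (not tied to any particular framework, and with hypothetical helper names), the code below maps a float32 array onto 8-bit integers using asymmetric min-max quantization and then reconstructs an approximation of the original values. Real PTQ toolchains additionally calibrate activation ranges on sample data and often quantize per channel.

```python
import numpy as np

def quantize_uint8(x: np.ndarray):
    """Affine (asymmetric) min-max quantization of a float32 array to uint8.

    Illustrative helper only; assumes the whole tensor shares one scale
    and zero-point (per-tensor quantization).
    """
    x_min, x_max = float(x.min()), float(x.max())
    # Scale maps the observed float range onto the 256 levels of uint8.
    scale = (x_max - x_min) / 255.0 if x_max != x_min else 1.0
    # Zero-point is the integer that represents the float value 0.0.
    zero_point = int(round(-x_min / scale))
    zero_point = max(0, min(255, zero_point))
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize_uint8(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Approximate reconstruction of the original float values."""
    return (q.astype(np.float32) - zero_point) * scale

# Example: quantize a small random weight matrix and check the rounding error.
weights = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_uint8(weights)
error = np.abs(dequantize_uint8(q, scale, zp) - weights).max()
print("max reconstruction error:", error)
```

The reconstruction error is bounded by roughly half the scale, which is why 8-bit quantization of well-behaved weight distributions usually costs little accuracy while cutting storage by a factor of four.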