#ai #localllm #selfhostedai #quantization
7+ main precision formats used in AI
⚪️ FP32
⚪️ FP16
⚪️ BF16
⚪️ FP8 (E4M3 / E5M2)
⚪️ FP4
⚪️ INT8/INT4
⚪️ 2-bit (ternary/binary quantization)
General trend: higher precision for training, lower precision for inference.
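A minimal sketch of what dropping precision means in practice, using NumPy (my assumption for the example): FP32 keeps roughly 7 decimal digits while FP16 keeps about 3, and symmetric INT8 quantization maps float weights onto a scaled integer grid. The variable names (`w`, `scale`, `q`) are illustrative, not from any particular library.

```python
import numpy as np

# FP32 vs FP16: the same value loses digits when stored in half precision
x = np.float32(3.14159265)   # FP32: ~7 decimal digits of precision
x16 = np.float16(x)          # FP16: ~3 decimal digits; rounds to 3.140625

# Symmetric INT8 quantization of a small weight vector (illustrative):
w = np.array([0.12, -0.5, 0.33, 0.9], dtype=np.float32)
scale = np.abs(w).max() / 127.0          # map the largest |weight| to 127
q = np.round(w / scale).astype(np.int8)  # integer codes in [-127, 127]
w_hat = q.astype(np.float32) * scale     # dequantized approximation of w
```

Round-trip error per weight is bounded by half the scale step, which is why INT8 works well for inference but is rarely used for gradient updates during training.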
Save the list and learn more about these formats here: huggingface.co/posts/Kseniase…
