INT4
4-bit Integer Quantization
Definition
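INT4 quantization stores model weights as 4-bit integers instead of 16- or 32-bit floats. This cuts weight storage by roughly 4x relative to 16-bit formats and 8x relative to 32-bit formats, and it can significantly increase inference speed on hardware with kernels that support 4-bit weights. Because each weight is rounded to one of only 16 levels, INT4 models carry quantization error that can degrade accuracy, particularly on complex reasoning tasks. Modern methods such as GPTQ and AWQ reduce this degradation but do not eliminate it.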
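To make the idea concrete, here is a minimal sketch of plain round-to-nearest INT4 quantization with one scale per group of weights. It is not GPTQ or AWQ, which choose the quantized values more carefully. The function names, the group size of 32, and the random test weights are assumptions made for illustration only.

```python
import random

def quantize_int4(weights, group_size=32):
    """Symmetric round-to-nearest quantization to the signed 4-bit range [-8, 7]."""
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # One float scale per group; fall back to 1.0 for an all-zero group.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        codes.extend(max(-8, min(7, round(w / scale))) for w in group)
    return codes, scales

def dequantize_int4(codes, scales, group_size=32):
    """Map 4-bit codes back to approximate float weights."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]

# Hypothetical weights, drawn from a narrow Gaussian for illustration.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(128)]

codes, scales = quantize_int4(weights)
restored = dequantize_int4(codes, scales)

# Quantization error: each weight is off by at most half a step of its group's scale.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(f"max abs error: {max_error:.6f}")
```

Smaller groups track the local weight range more closely and lower the error, at the cost of storing more scales.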
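INT4 is commonly used to fit large models into limited GPU memory. A 70B-parameter model needs about 35 GB for its weights at 4 bits, compared with about 140 GB at 16 bits. That still exceeds the 24 GB of most high-end consumer GPUs, so running 70B+ models on a single consumer card typically also relies on partial CPU offloading or a card with more memory.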
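A back-of-the-envelope estimate shows how much memory the weights alone occupy at different precisions. This is a rough sketch: the helper name, the group size of 128, and the 16-bit scale per group are illustrative assumptions, and the figures exclude activations and the KV cache.

```python
def weight_memory_gb(num_params, bits_per_weight, group_size=None, scale_bits=16):
    """Approximate weight storage in GB (10^9 bytes), optionally adding per-group scales."""
    total_bits = num_params * bits_per_weight
    if group_size:
        # Each group of weights stores one extra scale value.
        total_bits += (num_params / group_size) * scale_bits
    return total_bits / 8 / 1e9

params = 70e9  # a 70B-parameter model

fp16 = weight_memory_gb(params, 16)
int4 = weight_memory_gb(params, 4)
int4_grouped = weight_memory_gb(params, 4, group_size=128)

print(f"FP16:                 {fp16:.1f} GB")
print(f"INT4:                 {int4:.1f} GB")
print(f"INT4 + group scales:  {int4_grouped:.1f} GB")
```

Comparing these numbers against the memory of a target GPU gives a quick first check on whether a model can fit before any offloading is considered.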