Two vendors quote you wildly different prices to run "the same" open model. One runs it on your laptop; the other needs a server rack. Often the difference isn't the model at all — it's quantization. It's one of the few technical concepts worth understanding because it directly shapes what AI costs you.
What it actually is
A model stores its "knowledge" as millions of numbers. By default each number is stored at high precision, which is accurate but memory-hungry. Quantization rounds those numbers down to a coarser format — think storing prices as whole dollars instead of dollars and cents. The model gets dramatically smaller and faster to run, while staying close to its original quality.
Why you should care about a rounding trick
Smaller models mean cheaper inference, lower latency, and the ability to run AI on hardware you already own instead of renting expensive cloud GPUs. A quantized model can be the difference between AI that costs you pennies per task and AI that costs you dollars. It's also what makes on-device AI — running a capable model on a phone or laptop with no internet — practical at all.
The tradeoff is real but usually small
Rounding loses information. Aggressively quantized models can get sloppier on hard reasoning, edge cases, and tasks that need precision. For drafting an email or summarizing a document, you likely won't notice. For complex analysis where a subtle error is costly, you might. The right question isn't "is quantization good or bad" — it's "is this level of quality good enough for this specific job."
The honest caveat
You will rarely choose a quantization level yourself; your vendor or tool does it for you. The point of knowing the term is to ask the right question when prices or performance look off: "What quantization are you running, and what did it cost in accuracy?" A vendor who can't answer that clearly is a flag.
Your next step
If you're evaluating any tool that runs an open model, ask that one question. The answer tells you whether the price reflects a smart engineering choice or a quality corner you'll regret later.