QuIDE — unified metric for quantized neural network efficiency
An arXiv paper proposes a single score for evaluating quantized-model trade-offs: the Intelligence Index I = (C × P) / log₂(T + 1), which collapses compression (C), accuracy (P), and latency (T) into one number.
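A minimal Python sketch of the formula, for illustration only. The function name `quide_index`, the argument units (compression as a size ratio, accuracy as a fraction in [0, 1], latency in milliseconds), and the example values are assumptions; the summary does not specify them.

```python
import math


def quide_index(compression: float, performance: float, latency: float) -> float:
    """Return I = (C * P) / log2(T + 1).

    Assumed for illustration (not stated in the summary):
    compression is a size ratio, performance is accuracy in [0, 1],
    latency is a positive time in milliseconds.
    """
    if latency <= 0:
        # log2(0 + 1) = 0 would make the denominator zero.
        raise ValueError("latency must be positive")
    return (compression * performance) / math.log2(latency + 1)


# Hypothetical 4-bit configuration: 8x smaller than FP32, 90% accuracy, 3 ms.
# log2(3 + 1) = 2, so the score is (8 * 0.9) / 2.
score = quide_index(8.0, 0.9, 3.0)
print(f"I = {score:.2f}")
```

Because latency enters through a logarithm while compression and accuracy enter linearly, the score is much less sensitive to latency than to the other two terms.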
Experiments across SimpleCNN, ResNet-18, and Llama-3-8B reveal task-dependent optima: 4-bit for MNIST and for LLMs, 8-bit for complex CNNs. An accuracy-gated variant of the index flags configurations in which 4-bit post-training quantization collapses accuracy.
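One way an accuracy gate could work is sketched below in Python. This is a guess at the idea, not the paper's definition: the summary does not give the gating rule, so the `min_retained` threshold, the choice to zero the score when the gate trips, and all names and values here are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class GatedScore:
    index: float     # score after gating (0.0 when the gate trips)
    collapsed: bool  # True when accuracy fell below the gate


def gated_index(
    compression: float,
    accuracy: float,
    latency: float,
    baseline_accuracy: float,
    min_retained: float = 0.95,  # hypothetical threshold, not from the paper
) -> GatedScore:
    """Compute I = (C * P) / log2(T + 1), then apply an assumed accuracy gate."""
    if latency <= 0:
        raise ValueError("latency must be positive")
    raw = (compression * accuracy) / math.log2(latency + 1)
    # Assumed rule: flag the configuration when it keeps less than
    # `min_retained` of the full-precision baseline accuracy.
    collapsed = accuracy < min_retained * baseline_accuracy
    return GatedScore(index=0.0 if collapsed else raw, collapsed=collapsed)


# Hypothetical numbers: a 4-bit PTQ run that drops to 10% accuracy is flagged,
# while an 8-bit run that stays close to the 92% baseline is not.
print(gated_index(8.0, 0.10, 3.0, baseline_accuracy=0.92))
print(gated_index(4.0, 0.91, 3.0, baseline_accuracy=0.92))
```

Without some gate of this kind, a high compression ratio could offset a collapsed accuracy in the ungated product C × P, which is the situation the variant is described as flagging.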