Ran the same eval suite across three quantizations of Llama-3.1-8B on M2 Pro.

| quant | mem | tok/s | HumanEval | MMLU-redux |
|-------|------|-------|-----------|------------|
| q2 | 2.7G | 48 | 44.1 | 57.2 |
| q4 | 4.…