mlxcommunity
Audio

On-device Whisper Large-v3 @ 4.1x realtime on M2 Pro

by tito · 2026-04-21 01:35

Throughput numbers on M2 Pro (12-core), 400MHz P-cores, ANE disabled so MLX runs on CPU/GPU only:

model         format  rt factor  mem
whisper-lg-3  f16     2.1x       3.8G
whisper-lg-3  q4      4.1x       1.1G
whisper-lg-3  q2      4.4x       660M

q2 loses accuracy on accented English and code-switching. q4 is the sweet spot: close to q2's speed at under a third of the f16 memory.
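
To put numbers on that trade-off, here is a small sketch that turns the table into ratios against a chosen baseline. The figures are copied from the table; reading 660M as 0.66 GB (decimal units) is an assumption, and the function name `relative_to` is made up for illustration.

```python
# (realtime factor, memory in GB) per format, copied from the table in the post.
# Assumption: "660M" is read as 0.66 GB, i.e. decimal units.
RESULTS = {
    "f16": (2.1, 3.8),
    "q4": (4.1, 1.1),
    "q2": (4.4, 0.66),
}


def relative_to(baseline, results=RESULTS):
    """Return {format: (speedup, memory_reduction)} relative to `baseline`.

    speedup          = rt_factor / baseline_rt_factor   (higher is faster)
    memory_reduction = baseline_mem / mem               (higher is smaller)
    """
    base_rt, base_mem = results[baseline]
    return {
        fmt: (rt / base_rt, base_mem / mem)
        for fmt, (rt, mem) in results.items()
    }


def format_report(baseline):
    """Render the ratios as plain-text lines, one per format."""
    lines = []
    for fmt, (speedup, mem_cut) in relative_to(baseline).items():
        lines.append(f"{fmt}: {speedup:.2f}x speed, {mem_cut:.2f}x less memory vs {baseline}")
    return "\n".join(lines)
```

Calling `format_report("f16")` gives the ratios against the f16 row, and `relative_to("q4")["q2"]` shows what the extra step down to q2 buys: roughly 7% more speed, paid for with the accuracy loss described above.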

Repo: github.com/seed/mlx-whisper-bench (imaginary, but you get the idea).
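
Since the repo is only illustrative, here is a minimal sketch of how a realtime factor could be measured. It assumes "rt factor" means audio duration divided by wall-clock transcription time, which the post does not spell out. The helper name `realtime_factor` and its `warmup`/`runs` parameters are hypothetical, and `transcribe` stands in for whatever transcription call you actually use.

```python
import time
from typing import Callable


def realtime_factor(
    audio_seconds: float,
    transcribe: Callable[[], object],
    warmup: int = 1,
    runs: int = 3,
) -> float:
    """Audio duration divided by the fastest wall-clock run of `transcribe`.

    A result of 4.1 means one second of audio takes about 1/4.1 s to process.
    Warmup calls are not timed, so one-off costs (model load, kernel
    compilation) do not skew the figure.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if runs < 1:
        raise ValueError("runs must be at least 1")

    for _ in range(warmup):
        transcribe()

    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        transcribe()
        best = min(best, time.perf_counter() - start)

    return audio_seconds / best
```

To use it, wrap your transcription call in a zero-argument lambda and pass the clip length in seconds. Taking the fastest of several runs reduces noise from background load; reporting the median instead is an equally reasonable choice.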

Replies


Bookmarked. What driver version? ANE off by explicit env var or just by MLX default?

seconded. saving this.
