MMMU — multimodal reasoning across images.
Best AI models for vision.
MMMU evaluates models on college-level questions paired with diagrams, charts, and images. Scores are sourced from each model's official MMMU submission.
Benchmarks used: MMMU
| # | Model | Score | From |
|---|---|---|---|
| 1 | | 81.7 | Google DeepMind |
| 2 | | 69.1 | OpenAI |
| 3 | | 64.9 | Google DeepMind |
| 4 | | 50.7 | Meta AI |
Showing top 4 models with published data on at least one of the benchmarks above. Scores are weighted averages on a 0–100 scale.
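As a rough illustration of how a weighted average over the selected benchmarks could be computed, here is a minimal sketch. The function name `weighted_average`, the equal weights, and the rule of renormalizing over only the benchmarks a model has published are assumptions made for illustration, not the page's documented method; `hypothetical_other` is a placeholder benchmark name, not a real one.

```python
def weighted_average(scores, weights):
    """Weighted mean over the benchmarks a model has a published score for.

    Assumption: benchmarks without a published score are skipped and the
    remaining weights are renormalized.
    """
    used = [name for name in scores if name in weights]
    if not used:
        raise ValueError("no published scores on the selected benchmarks")
    total_weight = sum(weights[name] for name in used)
    return sum(scores[name] * weights[name] for name in used) / total_weight


# With MMMU as the only selected benchmark, the average is the MMMU score itself.
single = weighted_average({"MMMU": 81.7}, {"MMMU": 1.0})

# A model with no score on a second (hypothetical) benchmark still qualifies,
# because it has published data on at least one selected benchmark.
partial = weighted_average(
    {"MMMU": 69.1},
    {"MMMU": 1.0, "hypothetical_other": 1.0},
)
```

Since MMMU is the only benchmark listed here, each score in the table would equal the model's MMMU result under this scheme.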