GPT-4o.
OpenAI's multimodal model — text, vision, audio in one.
No self-host path — closed weights.
GPT-4o's weights aren't published. Use it via the access providers below.
Cheapest hosted endpoints.
What it's best at.
Speed across providers.
Tokens/sec and time-to-first-token measured against the same prompt template on each provider's API.
| Provider | Tokens/sec | TTFT | Total |
|---|---|---|---|
| OpenRouter | 56.7 | 764 ms | 2151 ms |
Variants in the GPT family.
Workloads.
Frequently asked.
How do I run GPT-4o?
Where can I access GPT-4o?
How much does it cost to run GPT-4o?
Is GPT-4o open-source or proprietary?
What it costs per month across providers.
Estimate your monthly bill for GPT-4o across every host that publishes per-token pricing. Slide your token volumes; the chart + table re-rank cheapest-first.
Cheapest provider on the left.
Total monthly cost — input + output tokens combined.
Bill breakdown.
What it's best at.
Scores normalised against benchmark ceilings (100 = perfect). Coloured by tier — coral 80+ frontier, lavender 65+ strong, sage 50+ solid, slate below.
Published scores.
| Benchmark | Score | Source |
|---|---|---|
| GPQA | 53.6 | official ↗ |
| MATH | 76.6 | official ↗ |
| MMLU | 88.7 | official ↗ |
| MMMU | 69.1 | official ↗ |
| HumanEval | 90.2 | official ↗ |
Independent rankings.
About GPT-4o.
GPT-4o (omni) is OpenAI's first model trained natively on text, vision, and audio in a single pass, released May 2024. It enables real-time voice conversations with sub-300ms latency and native image understanding without a separate vision encoder. Largely superseded by GPT-5 for new builds, but still widely deployed in production because the API surface and prompt patterns are well-understood. Cheaper than GPT-5 input-side; comparable on output. 128K context. Most often used today for voice-mode applications where its native audio support beats GPT-5's API-only voice.