by OpenAI

GPT-4o Mini.

multimodal closed 128K ctx Quality 85.3
Cheapest input
$0.15/M
on OpenAI API
Cheapest output
$0.6/M
on OpenAI API
Fastest
35 tok/s
on OpenRouter
Hosted equiv.
~$0.22/hr
@ 100 tok/s on OpenAI API

Cheap multimodal default — replaced GPT-3.5 Turbo for low-cost workloads.

Hosted API only

No self-host path — closed weights.

GPT-4o Mini's weights aren't published. Use it via the access providers below.

Where to use it

Cheapest hosted endpoints.

Provider Access $/M in $/M out
OpenAI API api direct $0.15 $0.6 Launch ↗
Azure OpenAI Service api direct $0.15 $0.6 Launch ↗
OpenRouter api aggregator $0.15 $0.6 Launch ↗
Capability snapshot Full benchmarks →

What it's best at.

Coding 87.2
General knowledge 82.0
Performance

Speed across providers.

Tokens/sec and time-to-first-token measured against the same prompt template on each provider's API.

Provider Tokens/sec TTFT Total
OpenRouter 34.8 1446 ms 3588 ms
Sources

Official references.

Best for

Workloads.

FAQ

Frequently asked.

How do I run GPT-4o Mini?
GPT-4o Mini is a closed-source API model. The cheapest way to access it is through the API providers listed on this page (direct API, aggregators, and hosted chat UIs).
Where can I access GPT-4o Mini?
GPT-4o Mini is available via OpenAI API, Azure OpenAI Service, OpenRouter. Each access option lists its own pricing (per million tokens or hourly hosting).
How much does it cost to run GPT-4o Mini?
API pricing starts at $0.15/M input tokens and $0.6/M output tokens. Self-hosting cost depends on the GPU you rent — see the Run It Yourself tab.
Is GPT-4o Mini open-source or proprietary?
GPT-4o Mini is a proprietary model from OpenAI. Access is API-only — there are no public weights to download.