Claude Haiku 4.5.
Fast, cheap Claude variant for high-throughput inference.
No self-host path — closed weights.
Claude Haiku 4.5's weights aren't published. Use it via the access providers below.
Cheapest hosted endpoints.
What it's best at.
Speed across providers.
Tokens/sec and time-to-first-token measured against the same prompt template on each provider's API.
| Provider | Tokens/sec | TTFT | Total |
|---|---|---|---|
| OpenRouter | 39.5 | 3388 ms | 4688 ms |
Variants in the Claude family.
Frontier reasoning and long-form coding from Anthropic.
Best price-performance from Anthropic. Default for production agents.
Anthropic's 3.5 generation — still in active production.
Fast/cheap Claude 3.5 variant — production fallback for Haiku 4.5.
Frequently asked.
How do I run Claude Haiku 4.5?
Where can I access Claude Haiku 4.5?
How much does it cost to run Claude Haiku 4.5?
Is Claude Haiku 4.5 open-source or proprietary?
What it costs per month across providers.
Estimate your monthly bill for Claude Haiku 4.5 across every host that publishes per-token pricing. Slide your token volumes; the chart + table re-rank cheapest-first.
Cheapest provider on the left.
Total monthly cost — input + output tokens combined.
Bill breakdown.
What it's best at.
Scores normalised against benchmark ceilings (100 = perfect). Coloured by tier — coral 80+ frontier, lavender 65+ strong, sage 50+ solid, slate below.
Published scores.
| Benchmark | Score | Source |
|---|---|---|
| GPQA | 42.0 | official ↗ |
| MMLU | 78.0 | official ↗ |
| HumanEval | 83.0 | official ↗ |
Independent rankings.
About Claude Haiku 4.5.
Claude Haiku 4.5 is Anthropic's fastest and cheapest model — built for high-throughput inference workloads (chat bots, classification, summarisation, autocomplete) where latency and cost dominate. Quality is roughly two tiers below Sonnet but still strong on shorter prompts. Same 200K context as the rest of Claude 4.x, same vision support, same tool-use API. Bedrock + Vertex AI mirror the model under enterprise contracts. Most teams use Haiku for the cost-sensitive layer of a pipeline (e.g. routing prompts to Sonnet only when needed).