by Anthropic

Claude Haiku 4.5.

text closed 200K ctx Transformer Quality 68.1 Elo 1287

Cheapest input

$1.0/M

on Anthropic API

Cheapest output

$5.0/M

on Anthropic API

Fastest

39 tok/s

on OpenRouter

Hosted equiv.

~$1.8/hr

@ 100 tok/s on Anthropic API

Fast, cheap Claude variant for high-throughput inference.

Hosted API only

No self-host path — closed weights.

Claude Haiku 4.5's weights aren't published. Use it via the access providers below.

Where to use it

Cheapest hosted endpoints.

Provider	Access	$/M in	$/M out
Anthropic API	api direct	$1.0	$5.0	Launch ↗
DeepInfra	hosted inference	$1.0	$5.0	Launch ↗
OpenRouter	api aggregator	$1.0	$5.0	Launch ↗

Capability snapshot Full benchmarks →

What it's best at.

Coding 83.0

General knowledge 78.0

Graduate-level science 42.0

Performance

Speed across providers.

Tokens/sec and time-to-first-token measured against the same prompt template on each provider's API.

Provider	Tokens/sec	TTFT	Total
OpenRouter	39.5	3388 ms	4688 ms

Sources

Official references.

Launch post ↗ Demo / Chat ↗ Docs ↗

Family

Variants in the Claude family.

Claude Opus 4.7

closed proprietary

Frontier reasoning and long-form coding from Anthropic.

Claude Sonnet 4.6

closed proprietary

Best price-performance from Anthropic. Default for production agents.

Claude 3.5 Sonnet

closed proprietary

Anthropic's 3.5 generation — still in active production.

Claude 3.5 Haiku

closed proprietary

Fast/cheap Claude 3.5 variant — production fallback for Haiku 4.5.

Best for

Workloads.

Run LLMs (inference / serving) Speech and audio AI

FAQ

Frequently asked.

How do I run Claude Haiku 4.5?

Claude Haiku 4.5 is a closed-source API model. The cheapest way to access it is through the API providers listed on this page (direct API, aggregators, and hosted chat UIs).

Where can I access Claude Haiku 4.5?

Claude Haiku 4.5 is available via Anthropic API, DeepInfra, OpenRouter. Each access option lists its own pricing (per million tokens or hourly hosting).

How much does it cost to run Claude Haiku 4.5?

API pricing starts at $1.0/M input tokens and $5.0/M output tokens. Self-hosting cost depends on the GPU you rent — see the Run It Yourself tab.

Is Claude Haiku 4.5 open-source or proprietary?

Claude Haiku 4.5 is a proprietary model from Anthropic. Access is API-only — there are no public weights to download.

API pricing

What it costs per month across providers.

Estimate your monthly bill for Claude Haiku 4.5 across every host that publishes per-token pricing. Slide your token volumes; the chart + table re-rank cheapest-first.

Cheapest

$20.0

Anthropic API

$/M input

$1.0

per million tokens

$/M output

$5.0

per million tokens

Providers

with priced rows

Monthly bill

Cheapest provider on the left.

Total monthly cost — input + output tokens combined.

Per provider

Bill breakdown.

Provider	$/M in	$/M out	Input cost	Output cost	Monthly total
Anthropic API	$1.0	$5.0	$10.0	$10.0	$20.0	Sign up ↗
OpenRouter	$1.0	$5.0	$10.0	$10.0	$20.0	Sign up ↗
DeepInfra	$1.0	$5.0	$10.0	$10.0	$20.0	Sign up ↗

Full calculator

Want to compare token volumes across other models too? Open the standalone API pricing tool →

Capability snapshot

What it's best at.

Coding 83.0

General knowledge 78.0

Graduate-level science 42.0

Scores normalised against benchmark ceilings (100 = perfect). Coloured by tier — coral 80+ frontier, lavender 65+ strong, sage 50+ solid, slate below.

Benchmarks

Published scores.

Benchmark	Score	Source
GPQA	42.0	official ↗
MMLU	78.0	official ↗
HumanEval	83.0	official ↗

Leaderboard standing

Independent rankings.

LMSYS Chatbot Arena

1287

Elo from blind head-to-head votes

View leaderboard ↗

Artificial Analysis Quality Index

65.0

Composite of reasoning + coding + tool-use benchmarks

View on Artificial Analysis ↗

Description

About Claude Haiku 4.5.

Claude Haiku 4.5 is Anthropic's fastest and cheapest model — built for high-throughput inference workloads (chat bots, classification, summarisation, autocomplete) where latency and cost dominate. Quality is roughly two tiers below Sonnet but still strong on shorter prompts. Same 200K context as the rest of Claude 4.x, same vision support, same tool-use API. Bedrock + Vertex AI mirror the model under enterprise contracts. Most teams use Haiku for the cost-sensitive layer of a pipeline (e.g. routing prompts to Sonnet only when needed).

Architecture

How it's built.

Architecture

Transformer

Knowledge cutoff

Apr 2025

183 days from cutoff to release.

Context window

How much it can remember.

200K tokens ≈ 150,000 English words

4K 32K 128K 1M

Max output per call: 8K tokens

Capabilities

What it can do.

✓ Vision input

· Audio input

· Video input

✓ Function calling

✓ Tool use

✓ JSON mode

✓ Streaming

· Fine-tuning