Claude Sonnet 4.6.
Best price-performance from Anthropic. Default for production agents.
No self-host path — closed weights.
Claude Sonnet 4.6's weights aren't published. Use it via the access providers below.
Cheapest hosted endpoints.
What it's best at.
Speed across providers.
Tokens/sec and time-to-first-token measured against the same prompt template on each provider's API.
| Provider | Tokens/sec | TTFT | Total |
|---|---|---|---|
| OpenRouter | 37.3 | 1634 ms | 4420 ms |
Variants in the Claude family.
Frontier reasoning and long-form coding from Anthropic.
Anthropic's 3.5 generation — still in active production.
Fast, cheap Claude variant for high-throughput inference.
Fast/cheap Claude 3.5 variant — production fallback for Haiku 4.5.
Frequently asked.
How do I run Claude Sonnet 4.6?
Where can I access Claude Sonnet 4.6?
How much does it cost to run Claude Sonnet 4.6?
Is Claude Sonnet 4.6 open-source or proprietary?
What it costs per month across providers.
Estimate your monthly bill for Claude Sonnet 4.6 across every host that publishes per-token pricing. Slide your token volumes; the chart + table re-rank cheapest-first.
Cheapest provider on the left.
Total monthly cost — input + output tokens combined.
Bill breakdown.
What it's best at.
Scores normalised against benchmark ceilings (100 = perfect). Coloured by tier — coral 80+ frontier, lavender 65+ strong, sage 50+ solid, slate below.
Published scores.
| Benchmark | Score | Source |
|---|---|---|
| GPQA | 59.0 | official ↗ |
| MATH | 82.0 | official ↗ |
| MMLU | 86.0 | official ↗ |
| MMLU-Pro | 78.0 | official ↗ |
| HumanEval | 91.5 | official ↗ |
| SWE-bench | 77.2 | official ↗ |
Independent rankings.
About Claude Sonnet 4.6.
Claude Sonnet 4.6 is Anthropic's price-performance flagship — fast enough for production traffic, smart enough to handle most agentic tasks Opus could. Released alongside Claude Code as the default model for software-engineering agents, Sonnet 4.6 leads the SWE-bench Verified leaderboard and powers most third-party coding tools. The 4.6 release brought the 1M-token context window option for long-document workflows (e.g. analysing entire codebases or legal contracts in one pass). Supports vision, tool use, function calling, and JSON mode. Cheaper than GPT-5 input-side; comparable output-side.
How it's built.
How much it can remember.
What it can do.
Every place this model is hosted.
Anthropic API
api directPoe by Quora
chat uiSubscription includes Claude usage
AWS Bedrock
api directAnthropic models on AWS Bedrock — same pricing as Anthropic direct + AWS infrastructure costs (storage / data transfer). Requires AWS account + Bedrock model access request.
Google Vertex AI
api directAnthropic models on Google Vertex AI — same pricing as Anthropic direct, billed via GCP. Requires Google Cloud project + Vertex AI Model Garden access.