API aggregators
OpenAI-compatible
US
OpenRouter.
Unified API in front of every major frontier API + open-weight host. Single key, one billing source, transparent per-model markup.
Price × speed
Where the sweet spot lives.
Each dot = one model on OpenRouter we've measured. Cheaper to the left, faster toward the top.
Loading...
Cheapest 12 models
Where the floor is.
Sorted cheapest-first by $/M input. Useful when you're looking for the floor before picking a model.
Loading...
At a glance
- Service type
- API aggregators
- Trust tier
- Tier 1
- Headquarters
- US
- OpenAI-compat
- Yes
- Open weights
- Yes
- Proprietary
- Yes
When to pick OpenRouter
Best for
- Building once and swapping models freely — same key, same endpoint shape.
- Workloads that benefit from automatic failover across upstreams.
- Anyone who wants per-token billing without managing N separate accounts.
Avoid for
- Workloads needing the absolute lowest per-token price (first-party usually wins).
- Anything requiring real-time price quotes from the original maker.
Models on OpenRouter
Pricing + measured speed + self-host alternative, one row per model. Click a column header to sort.
| Model ↕ | Maker ↕ | Access ↕ | $/M in ↕ | $/M out ↕ | Tokens/sec ↕ | TTFT ↕ | Self-host on ↕ | |
|---|---|---|---|---|---|---|---|---|
| Grok 3 | xAI | api aggregator | — | — | 111.5 | 5086 ms | API only | Open → |
| DeepSeek V3 | DeepSeek | api aggregator | — | — | 30.3 | 1207 ms | 2× AMD MI325 · INT4 | Open → |
| Command R+ | Cohere | api aggregator | — | — | 31.9 | 855 ms | 1× Nvidia H100 · INT4 | Open → |
| Claude 3.5 Sonnet | Anthropic | api aggregator | — | — | 32.6 | 1376 ms | API only | Open → |
| GPT-OSS 20B | OpenAI | api aggregator | $0.03 | $0.14 | 125.5 | — | 1× Nvidia GeForce RTX 4080 · INT4 | Open → |
| Gemini 2.5 Pro | Google DeepMind | api aggregator | $1.25 | $10.0 | 81.2 | 5911 ms | API only | Open → |
| Gemma 3 4B | Google DeepMind | api aggregator | $0.04 | $0.08 | 20.4 | 920 ms | 1× Nvidia Titan V · INT4 | Open → |
| Gemma 3 27B | Google DeepMind | api aggregator | $0.08 | $0.16 | 44.9 | 782 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Claude 3.5 Haiku | Anthropic | api aggregator | $0.8 | $4.0 | 28.8 | 1146 ms | API only | Open → |
| Llama 3.2 3B | Meta AI | api aggregator | $0.0509 | $0.335 | 107.9 | 439 ms | 1× Nvidia Titan V · INT4 | Open → |
| Llama 3.1 70B | Meta AI | api aggregator | $0.4 | $0.4 | 36.6 | 522 ms | 1× Nvidia L40S · INT4 | Open → |
| Llama 3.1 8B | Meta AI | api aggregator | $0.02 | $0.05 | 50.0 | 738 ms | 1× Nvidia P102-100 · INT4 | Open → |
| GPT-4o | OpenAI | api aggregator | $2.5 | $10.0 | 56.7 | 764 ms | API only | Open → |
| Qwen: Qwen3.5-Flash | Alibaba (Qwen Team) | api aggregator | $0.065 | $0.26 | 137.9 | 17893 ms | API only | Open → |
| LiquidAI: LFM2-24B-A2B | Liquid | api aggregator | $0.03 | $0.12 | 77.7 | 692 ms | 1× Nvidia GeForce RTX 4080 · INT4 | Open → |
| Google: Gemini 3.1 Pro Preview Custom Tools | Google DeepMind | api aggregator | $2.0 | $12.0 | 96.9 | 5101 ms | API only | Open → |
| OpenAI: GPT-5.3-Codex | OpenAI | api aggregator | $1.75 | $14.0 | 55.8 | 8122 ms | API only | Open → |
| AionLabs: Aion-2.0 | Aion Labs | api aggregator | $0.8 | $1.6 | 21.7 | 24181 ms | API only | Open → |
| Google: Gemini 3.1 Pro Preview | Google DeepMind | api aggregator | $2.0 | $12.0 | 88.2 | 5298 ms | API only | Open → |
| Qwen: Qwen3.5 Plus 2026-02-15 | Alibaba (Qwen Team) | api aggregator | $0.26 | $1.56 | 54.6 | 62172 ms | API only | Open → |
| Qwen: Qwen3.5 397B A17B | Alibaba (Qwen Team) | api aggregator | $0.39 | $2.34 | 86.1 | — | 1× AMD MI325 · INT4 | Open → |
| MiniMax: MiniMax M2.5 | MiniMax | api aggregator | $0.15 | $1.15 | 80.7 | — | API only | Open → |
| GLM-5 | Zhipu AI | api aggregator | $0.6 | $1.92 | 43.0 | — | 1× AMD MI325 · INT4 | Open → |
| Qwen: Qwen3 Max Thinking | Alibaba (Qwen Team) | api aggregator | $0.78 | $3.9 | 26.6 | 953 ms | API only | Open → |
| Anthropic: Claude Opus 4.6 | Anthropic | api aggregator | $5.0 | $25.0 | 20.8 | 2196 ms | API only | Open → |
| GLM-4.5-Air | Zhipu AI | api aggregator | $0.13 | $0.85 | 19.1 | 1992 ms | 1× Nvidia H100 · INT4 | Open → |
| Kimi K2 | Moonshot AI | api aggregator | $0.57 | $2.3 | 14.5 | 2170 ms | 4× Nvidia H200 · FP8 | Open → |
| DeepSeek R1 Distill Llama 70B | DeepSeek | api aggregator | $0.7 | $0.8 | 47.0 | 6622 ms | 1× Nvidia L40S · INT4 | Open → |
| DeepSeek R1 | DeepSeek | api aggregator | $0.7 | $2.5 | 21.8 | 13392 ms | 2× AMD MI325 · INT4 | Open → |
| Phi-4 | Microsoft | api aggregator | $0.065 | $0.14 | 29.7 | 1987 ms | 1× Nvidia RTX 3080 · INT4 | Open → |
| Qwen 2.5 Coder 32B | Alibaba (Qwen Team) | api aggregator | $0.66 | $1.0 | 30.4 | 777 ms | 1× Nvidia RTX A5000 · INT4 | Open → |
| Llama 3.2 11B Vision | Meta AI | api aggregator | $0.245 | $0.245 | 40.9 | 630 ms | 1× Nvidia GTX 1070 Ti · INT4 | Open → |
| Llama 3.2 1B | Meta AI | api aggregator | $0.027 | $0.201 | 66.0 | 1045 ms | 1× Nvidia Titan V · FP8 | Open → |
| Qwen 2.5 72B | Alibaba (Qwen Team) | api aggregator | $0.36 | $0.4 | 25.9 | 768 ms | 1× Nvidia L40S · INT4 | Open → |
| Hermes 3 70B | Nous Research | api aggregator | $0.3 | $0.3 | 39.5 | 640 ms | 1× Nvidia L40S · INT4 | Open → |
| Mistral Nemo 12B | Mistral AI | api aggregator | $0.02 | $0.03 | 33.5 | 1025 ms | 1× Nvidia GTX 1070 Ti · INT4 | Open → |
| Mixtral 8x22B | Mistral AI | api aggregator | $2.0 | $6.0 | 98.3 | 618 ms | 1× Nvidia H100 NVL · INT4 | Open → |
| Qwen: Qwen3.7 Max | Alibaba (Qwen Team) | api aggregator | $2.5 | $7.5 | 93.1 | 42100 ms | API only | Open → |
| xAI: Grok Build 0.1 | xAI | api aggregator | $1.0 | $2.0 | 123.9 | 24527 ms | API only | Open → |
| Google: Gemini 3.5 Flash | Google DeepMind | api aggregator | $1.5 | $9.0 | 157.7 | 3091 ms | API only | Open → |
| Anthropic: Claude Opus 4.7 (Fast) | Anthropic | api aggregator | $30.0 | $150.0 | 81.4 | 1649 ms | API only | Open → |
| Perceptron: Perceptron Mk1 | Perceptron | api aggregator | $0.15 | $1.5 | 36.0 | 1094 ms | API only | Open → |
| inclusionAI: Ring-2.6-1T | Inclusionai | api aggregator | $0.075 | $0.625 | 98.4 | 3430 ms | API only | Open → |
| Google: Gemini 3.1 Flash Lite | Google DeepMind | api aggregator | $0.25 | $1.5 | 54.7 | 1302 ms | API only | Open → |
| OpenAI: GPT Chat Latest | OpenAI | api aggregator | $5.0 | $30.0 | 35.2 | 712 ms | API only | Open → |
| xAI: Grok 4.3 | xAI | api aggregator | $1.25 | $2.5 | 125.6 | 3215 ms | API only | Open → |
| IBM: Granite 4.1 8B | IBM Research | api aggregator | $0.05 | $0.1 | 87.0 | 445 ms | 1× Nvidia P102-100 · INT4 | Open → |
| Mistral: Mistral Medium 3.5 | Mistral AI | api aggregator | $1.5 | $7.5 | 76.8 | 687 ms | API only | Open → |
| Owl Alpha | Openrouter | api aggregator | — | — | 12.8 | 3301 ms | API only | Open → |
| Anthropic Claude Haiku Latest | Anthropic | api aggregator | $1.0 | $5.0 | 57.9 | 1681 ms | API only | Open → |
| OpenAI GPT Mini Latest | OpenAI | api aggregator | $0.75 | $4.5 | 50.7 | 895 ms | API only | Open → |
| Google Gemini Pro Latest | api aggregator | $2.0 | $12.0 | 71.6 | 6856 ms | API only | Open → | |
| MoonshotAI Kimi Latest | Moonshot AI | api aggregator | $0.73 | $3.49 | 34.9 | — | 4× AMD MI300 · INT4 | Open → |
| Google Gemini Flash Latest | api aggregator | $1.5 | $9.0 | 143.6 | 3351 ms | API only | Open → | |
| Anthropic Claude Sonnet Latest | Anthropic | api aggregator | $3.0 | $15.0 | 35.6 | 1237 ms | API only | Open → |
| OpenAI GPT Latest | OpenAI | api aggregator | $5.0 | $30.0 | 51.9 | 2365 ms | API only | Open → |
| Qwen: Qwen3.5 Plus 2026-04-20 | Alibaba (Qwen Team) | api aggregator | $0.3 | $1.8 | 54.6 | 35689 ms | 1× AMD MI300 · INT4 | Open → |
| Qwen: Qwen3.6 Flash | Alibaba (Qwen Team) | api aggregator | $0.1875 | $1.125 | 202.2 | 10770 ms | API only | Open → |
| Qwen: Qwen3.6 35B A3B | Alibaba (Qwen Team) | api aggregator | $0.15 | $1.0 | 35.2 | — | 1× Nvidia RTX A5000 · INT4 | Open → |
| Qwen: Qwen3.6 Max Preview | Alibaba (Qwen Team) | api aggregator | $1.04 | $6.24 | 38.2 | 51168 ms | API only | Open → |
| Qwen: Qwen3.6 27B | Alibaba (Qwen Team) | api aggregator | $0.3 | $3.2 | 45.8 | — | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| OpenAI: GPT-5.5 Pro | OpenAI | api aggregator | $30.0 | $180.0 | 7.2 | 69214 ms | API only | Open → |
| OpenAI: GPT-5.5 | OpenAI | api aggregator | $5.0 | $30.0 | 44.3 | 2741 ms | API only | Open → |
| DeepSeek: DeepSeek V4 Pro | DeepSeek | api aggregator | $0.435 | $0.87 | 20.9 | — | API only | Open → |
| DeepSeek: DeepSeek V4 Flash | DeepSeek | api aggregator | $0.1 | $0.2 | 82.1 | — | API only | Open → |
| inclusionAI: Ling-2.6-1T | Inclusionai | api aggregator | $0.075 | $0.625 | 44.9 | 1494 ms | API only | Open → |
| Tencent: Hy3 preview | Tencent | api aggregator | $0.066 | $0.26 | 54.1 | — | API only | Open → |
| Xiaomi: MiMo-V2.5-Pro | Xiaomi | api aggregator | $1.0 | $3.0 | 34.4 | 1875 ms | 1× Nvidia RTX 6000 Ada · INT4 | Open → |
| Xiaomi: MiMo-V2.5 | Xiaomi | api aggregator | $0.4 | $2.0 | 65.5 | 3998 ms | API only | Open → |
| OpenAI: GPT-5.4 Image 2 | OpenAI | api aggregator | $8.0 | $15.0 | 41.6 | 1041 ms | API only | Open → |
| inclusionAI: Ling-2.6-flash | Inclusionai | api aggregator | $0.01 | $0.03 | 71.1 | 857 ms | API only | Open → |
| Anthropic: Claude Opus Latest | Anthropic | api aggregator | $5.0 | $25.0 | 53.8 | 1362 ms | API only | Open → |
| Qwen: Qwen3 Coder Next | Alibaba (Qwen Team) | api aggregator | $0.11 | $0.8 | 80.7 | 1490 ms | 2× AMD MI300 · INT4 | Open → |
| Free Models Router | Openrouter | api aggregator | — | — | 11.0 | 11593 ms | API only | Open → |
| StepFun: Step 3.5 Flash | Stepfun | api aggregator | $0.09 | $0.3 | 107.1 | — | API only | Open → |
| Kimi K2.5 | Moonshot AI | api aggregator | $0.4 | $1.9 | 68.2 | — | 4× AMD MI300 · INT4 | Open → |
| Upstage: Solar Pro 3 | Upstage | api aggregator | $0.15 | $0.6 | — | — | API only | Open → |
| MiniMax: MiniMax M2-her | MiniMax | api aggregator | $0.3 | $1.2 | 28.8 | 2664 ms | API only | Open → |
| Writer: Palmyra X5 | Writer | api aggregator | $0.6 | $6.0 | 46.9 | 651 ms | 1× Nvidia RTX 6000 Ada · INT4 | Open → |
| Z.ai: GLM 4.7 Flash | Zhipu AI | api aggregator | $0.06 | $0.4 | 41.0 | — | API only | Open → |
| OpenAI: GPT-5.2-Codex | OpenAI | api aggregator | $1.75 | $14.0 | 53.4 | 1594 ms | API only | Open → |
| ByteDance Seed: Seed 1.6 Flash | Bytedance Seed | api aggregator | $0.075 | $0.3 | 121.0 | 4526 ms | API only | Open → |
| ByteDance Seed: Seed 1.6 | Bytedance Seed | api aggregator | $0.25 | $2.0 | 81.1 | 9173 ms | 1× Nvidia H200 · INT4 | Open → |
| Google: Gemma 4 26B A4B | Google DeepMind | api aggregator | $0.06 | $0.33 | 51.4 | 864 ms | API only | Open → |
| Pareto Code Router | Openrouter | api aggregator | — | — | 48.6 | — | API only | Open → |
| Baidu: Qianfan-OCR-Fast | Baidu | api aggregator | $0.68 | $2.81 | 49.8 | 518 ms | API only | Open → |
| Kimi K2.6 | Moonshot AI | api aggregator | $0.73 | $3.49 | 18.5 | — | 4× AMD MI300 · INT4 | Open → |
| Claude Opus 4.7 | Anthropic | api aggregator | $5.0 | $25.0 | 37.5 | 1433 ms | API only | Open → |
| Anthropic: Claude Opus 4.6 (Fast) | Anthropic | api aggregator | $30.0 | $150.0 | 74.8 | 1063 ms | API only | Open → |
| GLM-5.1 | Zhipu AI | api aggregator | $0.98 | $3.08 | 171.6 | — | 1× Nvidia RTX 4000 Ada SFF · INT4 | Open → |
| Google: Gemma 4 31B | Google DeepMind | api aggregator | $0.12 | $0.37 | 6.0 | 1477 ms | API only | Open → |
| Qwen: Qwen3.6 Plus | Alibaba (Qwen Team) | api aggregator | $0.325 | $1.95 | 54.7 | 52804 ms | API only | Open → |
| Z.ai: GLM 5V Turbo | Zhipu AI | api aggregator | $1.2 | $4.0 | 59.2 | — | API only | Open → |
| Arcee AI: Trinity Large Thinking | Arcee Ai | api aggregator | $0.22 | $0.85 | 103.8 | — | API only | Open → |
| xAI: Grok 4.20 Multi-Agent | xAI | api aggregator | $2.0 | $6.0 | 427.5 | 5314 ms | API only | Open → |
| xAI: Grok 4.20 | xAI | api aggregator | $1.25 | $2.5 | 54.7 | 776 ms | API only | Open → |
| Google: Lyria 3 Pro Preview | Google DeepMind | api aggregator | — | — | — | — | API only | Open → |
| Google: Lyria 3 Clip Preview | Google DeepMind | api aggregator | — | — | 6.1 | 3383 ms | API only | Open → |
| Kwaipilot: KAT-Coder-Pro V2 | Kwaipilot | api aggregator | $0.3 | $1.2 | 39.8 | 1792 ms | API only | Open → |
| Reka Edge | Rekaai | api aggregator | $0.1 | $0.1 | 24.1 | 3808 ms | 1× Nvidia P102-100 · INT4 | Open → |
| Xiaomi: MiMo-V2-Omni | Xiaomi | api aggregator | $0.4 | $2.0 | 34.5 | 7511 ms | API only | Open → |
| Xiaomi: MiMo-V2-Pro | Xiaomi | api aggregator | $1.0 | $3.0 | 50.0 | 7078 ms | API only | Open → |
| MiniMax: MiniMax M2.7 | MiniMax | api aggregator | $0.279 | $1.2 | 80.5 | — | API only | Open → |
| OpenAI: GPT-5.4 Nano | OpenAI | api aggregator | $0.2 | $1.25 | 69.8 | 767 ms | API only | Open → |
| OpenAI: GPT-5.4 Mini | OpenAI | api aggregator | $0.75 | $4.5 | 63.3 | 882 ms | API only | Open → |
| Mistral: Mistral Small 4 | Mistral AI | api aggregator | $0.15 | $0.6 | 83.7 | 863 ms | API only | Open → |
| GLM-5 Turbo | Zhipu AI | api aggregator | $1.2 | $4.0 | 14.1 | — | API only | Open → |
| NVIDIA: Nemotron 3 Super | Nvidia | api aggregator | $0.09 | $0.45 | 29.4 | — | 1× Nvidia H100 · INT4 | Open → |
| ByteDance Seed: Seed-2.0-Lite | Bytedance Seed | api aggregator | $0.25 | $2.0 | 28.1 | 64808 ms | 1× Nvidia RTX 4000 Ada SFF · INT4 | Open → |
| Qwen: Qwen3.5-9B | Alibaba (Qwen Team) | api aggregator | $0.04 | $0.15 | 20.8 | — | 1× Nvidia GeForce RTX 2060 · INT4 | Open → |
| OpenAI: GPT-5.4 Pro | OpenAI | api aggregator | $30.0 | $180.0 | 14.2 | 34714 ms | API only | Open → |
| OpenAI: GPT-5.4 | OpenAI | api aggregator | $2.5 | $15.0 | 45.2 | 823 ms | API only | Open → |
| Inception: Mercury 2 | Inception | api aggregator | $0.25 | $0.75 | 336.8 | — | API only | Open → |
| OpenAI: GPT-5.3 Chat | OpenAI | api aggregator | $1.75 | $14.0 | 52.6 | 1274 ms | API only | Open → |
| Google: Gemini 3.1 Flash Lite Preview | Google DeepMind | api aggregator | $0.25 | $1.5 | 41.8 | 1334 ms | API only | Open → |
| ByteDance Seed: Seed-2.0-Mini | Bytedance Seed | api aggregator | $0.1 | $0.4 | 30.1 | 116844 ms | API only | Open → |
| Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview) | Google DeepMind | api aggregator | $0.5 | $3.0 | 64.0 | 1901 ms | API only | Open → |
| Qwen: Qwen3.5-35B-A3B | Alibaba (Qwen Team) | api aggregator | $0.139 | $1.0 | 31.5 | — | 1× Nvidia RTX A5000 · INT4 | Open → |
| Qwen: Qwen3.5-27B | Alibaba (Qwen Team) | api aggregator | $0.195 | $1.56 | 104.3 | — | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Qwen: Qwen3.5-122B-A10B | Alibaba (Qwen Team) | api aggregator | $0.26 | $2.08 | 172.1 | 17142 ms | 1× Nvidia H100 · INT4 | Open → |
| GLM-4.6 | Zhipu AI | api aggregator | $0.43 | $1.74 | 45.4 | 23832 ms | 1× AMD MI325 · INT4 | Open → |
| MiniMax: MiniMax M2.1 | MiniMax | api aggregator | $0.29 | $0.95 | 83.0 | — | API only | Open → |
| OpenAI: GPT-5.4 Pro | OpenAI | api aggregator | $30.0 | $180.0 | 14.2 | 34714 ms | API only | Open → |
| Claude Sonnet 4.6 | Anthropic | api aggregator | $3.0 | $15.0 | 37.3 | 1634 ms | API only | Open → |
| GLM-4.7 | Zhipu AI | api aggregator | $0.4 | $1.75 | 66.6 | 15014 ms | 1× Nvidia GTX 1660 Ti · INT4 | Open → |
| Google: Gemini 3 Flash Preview | Google DeepMind | api aggregator | $0.5 | $3.0 | 53.0 | 1801 ms | API only | Open → |
| Xiaomi: MiMo-V2-Flash | Xiaomi | api aggregator | $0.1 | $0.3 | 38.0 | 2002 ms | 1× Nvidia P102-100 · INT4 | Open → |
| NVIDIA: Nemotron 3 Nano 30B A3B | Nvidia | api aggregator | $0.05 | $0.2 | 68.2 | 7329 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| OpenAI: GPT-5.2 Chat | OpenAI | api aggregator | $1.75 | $14.0 | 65.6 | 6583 ms | API only | Open → |
| OpenAI: GPT-5.2 Pro | OpenAI | api aggregator | $21.0 | $168.0 | 6.6 | 18022 ms | API only | Open → |
| OpenAI: GPT-5.2 | OpenAI | api aggregator | $1.75 | $14.0 | 33.2 | 887 ms | API only | Open → |
| Mistral: Devstral 2 2512 | Mistral AI | api aggregator | $0.4 | $2.0 | 54.7 | 549 ms | API only | Open → |
| Relace: Relace Search | Relace | api aggregator | $1.0 | $3.0 | 17.3 | 4394 ms | API only | Open → |
| Z.ai: GLM 4.6V | Zhipu AI | api aggregator | $0.3 | $0.9 | 20.8 | — | API only | Open → |
| Nex AGI: DeepSeek V3.1 Nex N1 | Nex Agi | api aggregator | $0.135 | $0.5 | 43.9 | 915 ms | API only | Open → |
| EssentialAI: Rnj 1 Instruct | Essentialai | api aggregator | $0.15 | $0.15 | 63.6 | 880 ms | API only | Open → |
| Body Builder (beta) | Openrouter | api aggregator | — | — | 40.6 | 11443 ms | API only | Open → |
| OpenAI: GPT-5.1-Codex-Max | OpenAI | api aggregator | $1.25 | $10.0 | 79.1 | 5659 ms | API only | Open → |
| Amazon: Nova 2 Lite | Amazon | api aggregator | $0.3 | $2.5 | 65.1 | 777 ms | API only | Open → |
| Mistral: Ministral 3 14B 2512 | Mistral AI | api aggregator | $0.2 | $0.2 | 79.6 | 507 ms | 1× Nvidia RTX 3080 · INT4 | Open → |
| Mistral: Ministral 3 8B 2512 | Mistral AI | api aggregator | $0.15 | $0.15 | 56.1 | 1151 ms | 1× Nvidia P102-100 · INT4 | Open → |
| Mistral: Ministral 3 3B 2512 | Mistral AI | api aggregator | $0.1 | $0.1 | 66.3 | 538 ms | 1× Nvidia GeForce GTX 1050 · INT4 | Open → |
| Mistral: Mistral Large 3 2512 | Mistral AI | api aggregator | $0.5 | $1.5 | 48.8 | 728 ms | API only | Open → |
| Arcee AI: Trinity Mini | Arcee Ai | api aggregator | $0.045 | $0.15 | 153.5 | 2492 ms | API only | Open → |
| DeepSeek: DeepSeek V3.2 Speciale | DeepSeek | api aggregator | $0.287 | $0.431 | — | — | API only | Open → |
| DeepSeek: DeepSeek V3.2 | DeepSeek | api aggregator | $0.252 | $0.378 | 22.0 | 846 ms | API only | Open → |
| Prime Intellect: INTELLECT-3 | Prime Intellect | api aggregator | $0.2 | $1.1 | 117.9 | 2357 ms | API only | Open → |
| Anthropic: Claude Opus 4.5 | Anthropic | api aggregator | $5.0 | $25.0 | 42.7 | 1501 ms | API only | Open → |
| AllenAI: Olmo 3 32B Think | Allen Institute for AI (AI2) | api aggregator | $0.15 | $0.5 | — | — | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Google: Nano Banana Pro (Gemini 3 Pro Image Preview) | Google DeepMind | api aggregator | $2.0 | $12.0 | 69.6 | — | API only | Open → |
| Deep Cogito: Cogito v2.1 671B | Deepcogito | api aggregator | $1.25 | $1.25 | 41.7 | 717 ms | 2× AMD MI325 · INT4 | Open → |
| OpenAI: GPT-5.1 | OpenAI | api aggregator | $1.25 | $10.0 | 78.7 | 835 ms | API only | Open → |
| OpenAI: GPT-5.1 Chat | OpenAI | api aggregator | $1.25 | $10.0 | 69.3 | 1240 ms | API only | Open → |
| OpenAI: GPT-5.1-Codex | OpenAI | api aggregator | $1.25 | $10.0 | 51.4 | 1187 ms | API only | Open → |
| OpenAI: GPT-5.1-Codex-Mini | OpenAI | api aggregator | $0.25 | $2.0 | 81.3 | 1325 ms | API only | Open → |
| Kimi K2 Thinking | Moonshot AI | api aggregator | $0.6 | $2.5 | 49.8 | 25704 ms | 4× AMD MI300 · INT4 | Open → |
| Amazon: Nova Premier 1.0 | Amazon | api aggregator | $2.5 | $12.5 | 38.3 | 1279 ms | API only | Open → |
| Perplexity: Sonar Pro Search | Perplexity | api aggregator | $3.0 | $15.0 | 23.8 | 2439 ms | API only | Open → |
| OpenAI: gpt-oss-safeguard-20b | OpenAI | api aggregator | $0.075 | $0.3 | 391.4 | — | API only | Open → |
| MiniMax: MiniMax M2 | MiniMax | api aggregator | $0.255 | $1.0 | 97.9 | — | 1× Nvidia B300 · INT4 | Open → |
| Qwen: Qwen3 VL 32B Instruct | Alibaba (Qwen Team) | api aggregator | $0.104 | $0.416 | 72.4 | 461 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| IBM: Granite 4.0 Micro | IBM Research | api aggregator | $0.017 | $0.112 | 27.7 | 686 ms | API only | Open → |
| Microsoft: Phi 4 Mini Instruct | Microsoft | api aggregator | $0.08 | $0.35 | 93.6 | 996 ms | 1× Nvidia Titan V · INT4 | Open → |
| OpenAI: GPT-5 Image Mini | OpenAI | api aggregator | $2.5 | $2.0 | 42.6 | — | API only | Open → |
| Qwen: Qwen3 VL 8B Thinking | Alibaba (Qwen Team) | api aggregator | $0.117 | $1.365 | 132.7 | 3551 ms | 1× Nvidia P102-100 · INT4 | Open → |
| Qwen: Qwen3 VL 8B Instruct | Alibaba (Qwen Team) | api aggregator | $0.08 | $0.5 | 41.0 | 1423 ms | 1× Nvidia P102-100 · INT4 | Open → |
| OpenAI: GPT-5 Image | OpenAI | api aggregator | $10.0 | $10.0 | 46.1 | — | API only | Open → |
| OpenAI: o3 Deep Research | OpenAI | api aggregator | $10.0 | $40.0 | 69.0 | — | API only | Open → |
| OpenAI: o4 Mini Deep Research | OpenAI | api aggregator | $2.0 | $8.0 | 18.2 | — | API only | Open → |
| NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 | Nvidia | api aggregator | $0.1 | $0.4 | 51.6 | — | 1× Nvidia Tesla V100 SXM2 32GB · INT4 | Open → |
| Baidu: ERNIE 4.5 21B A3B Thinking | Baidu | api aggregator | $0.07 | $0.28 | — | — | 1× Nvidia RTX 4060 Ti · INT4 | Open → |
| Google: Nano Banana (Gemini 2.5 Flash Image) | Google DeepMind | api aggregator | $0.3 | $2.5 | 69.6 | 1094 ms | API only | Open → |
| Qwen: Qwen3 VL 30B A3B Thinking | Alibaba (Qwen Team) | api aggregator | $0.13 | $1.56 | 61.8 | 8642 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Qwen: Qwen3 VL 30B A3B Instruct | Alibaba (Qwen Team) | api aggregator | $0.13 | $0.52 | 49.3 | 846 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| OpenAI: GPT-5 Pro | OpenAI | api aggregator | $15.0 | $120.0 | 9.5 | — | API only | Open → |
| Anthropic: Claude Sonnet 4.5 | Anthropic | api aggregator | $3.0 | $15.0 | 33.0 | 1621 ms | API only | Open → |
| DeepSeek: DeepSeek V3.2 Exp | DeepSeek | api aggregator | $0.27 | $0.41 | 6.8 | 10060 ms | 2× AMD MI325 · INT4 | Open → |
| TheDrummer: Cydonia 24B V4.1 | Thedrummer | api aggregator | $0.3 | $0.5 | 44.7 | 526 ms | 1× Nvidia GeForce RTX 4080 · INT4 | Open → |
| Relace: Relace Apply 3 | Relace | api aggregator | $0.85 | $1.25 | — | — | API only | Open → |
| Google: Gemini 2.5 Flash Lite Preview 09-2025 | Google DeepMind | api aggregator | $0.1 | $0.4 | 30.5 | 1139 ms | API only | Open → |
| Qwen: Qwen3 VL 235B A22B Thinking | Alibaba (Qwen Team) | api aggregator | $0.26 | $2.6 | 48.0 | 8958 ms | 1× AMD MI300 · INT4 | Open → |
| Qwen: Qwen3 VL 235B A22B Instruct | Alibaba (Qwen Team) | api aggregator | $0.2 | $0.88 | 52.7 | 721 ms | 1× AMD MI300 · INT4 | Open → |
| Qwen: Qwen3 Max | Alibaba (Qwen Team) | api aggregator | $0.78 | $3.9 | 25.9 | 1232 ms | 1× AMD MI300 · INT4 | Open → |
| Qwen: Qwen3 Coder Plus | Alibaba (Qwen Team) | api aggregator | $0.65 | $3.25 | 32.4 | 892 ms | API only | Open → |
| OpenAI: GPT-5 Codex | OpenAI | api aggregator | $1.25 | $10.0 | 168.8 | 2586 ms | API only | Open → |
| DeepSeek: DeepSeek V3.1 Terminus | DeepSeek | api aggregator | $0.27 | $0.95 | 17.8 | 1622 ms | API only | Open → |
| Tongyi DeepResearch 30B A3B | Alibaba (Qwen Team) | api aggregator | $0.09 | $0.45 | 75.8 | 4308 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Qwen: Qwen3 Coder Flash | Alibaba (Qwen Team) | api aggregator | $0.195 | $0.975 | 54.8 | 903 ms | API only | Open → |
| Qwen: Qwen3 Next 80B A3B Thinking | Alibaba (Qwen Team) | api aggregator | $0.0975 | $0.78 | 145.0 | 5387 ms | 1× Nvidia A16 · INT4 | Open → |
| Qwen: Qwen3 Next 80B A3B Instruct | Alibaba (Qwen Team) | api aggregator | $0.09 | $1.1 | 80.8 | 666 ms | 1× Nvidia A16 · INT4 | Open → |
| Qwen: Qwen Plus 0728 | Alibaba (Qwen Team) | api aggregator | $0.26 | $0.78 | 53.0 | 474 ms | API only | Open → |
| NVIDIA: Nemotron Nano 9B V2 | Nvidia | api aggregator | $0.04 | $0.16 | 119.8 | 4061 ms | 1× Nvidia GeForce RTX 2060 · INT4 | Open → |
| MoonshotAI: Kimi K2 0905 | Moonshot AI | api aggregator | $0.6 | $2.5 | 18.0 | 1228 ms | 4× AMD MI300 · INT4 | Open → |
| Qwen: Qwen3 30B A3B Thinking 2507 | Alibaba (Qwen Team) | api aggregator | $0.08 | $0.4 | 85.8 | 6367 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Nous: Hermes 4 70B | Nous Research | api aggregator | $0.13 | $0.4 | 49.7 | 867 ms | 1× Nvidia L40S · INT4 | Open → |
| Nous: Hermes 4 405B | Nous Research | api aggregator | $1.0 | $3.0 | 27.8 | 753 ms | 1× AMD MI325 · INT4 | Open → |
| DeepSeek: DeepSeek V3.1 | DeepSeek | api aggregator | $0.21 | $0.79 | 46.1 | 661 ms | API only | Open → |
| Mistral: Mistral Medium 3.1 | Mistral AI | api aggregator | $0.4 | $2.0 | 58.4 | 669 ms | API only | Open → |
| Baidu: ERNIE 4.5 21B A3B | Baidu | api aggregator | $0.07 | $0.28 | — | — | 1× Nvidia RTX 4060 Ti · INT4 | Open → |
| Baidu: ERNIE 4.5 VL 28B A3B | Baidu | api aggregator | $0.14 | $0.56 | — | — | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Z.ai: GLM 4.5V | Zhipu AI | api aggregator | $0.6 | $1.8 | 4.2 | — | 1× Nvidia GTX 1660 Ti · INT4 | Open → |
| AI21: Jamba Large 1.7 | Ai21 | api aggregator | $2.0 | $8.0 | 37.7 | 1441 ms | API only | Open → |
| OpenAI: GPT-5 Chat | OpenAI | api aggregator | $1.25 | $10.0 | 72.6 | 730 ms | API only | Open → |
| OpenAI: GPT-5 Mini | OpenAI | api aggregator | $0.25 | $2.0 | 52.7 | — | API only | Open → |
| OpenAI: GPT-5 Nano | OpenAI | api aggregator | $0.05 | $0.4 | 50.1 | — | API only | Open → |
| Anthropic: Claude Opus 4.1 | Anthropic | api aggregator | $15.0 | $75.0 | 14.3 | 3307 ms | API only | Open → |
| Mistral: Codestral 2508 | Mistral AI | api aggregator | $0.3 | $0.9 | 73.8 | 516 ms | API only | Open → |
| Qwen: Qwen3 Coder 30B A3B Instruct | Alibaba (Qwen Team) | api aggregator | $0.07 | $0.27 | 49.5 | 1156 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Qwen: Qwen3 30B A3B Instruct 2507 | Alibaba (Qwen Team) | api aggregator | $0.09 | $0.3 | 75.6 | 517 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| GLM-4.5 | Zhipu AI | api aggregator | $0.6 | $2.2 | 47.7 | 7589 ms | 1× AMD MI325 · INT4 | Open → |
| Qwen: Qwen3 235B A22B Thinking 2507 | Alibaba (Qwen Team) | api aggregator | $0.1495 | $1.495 | 34.5 | 13210 ms | 1× AMD MI300 · INT4 | Open → |
| Z.ai: GLM 4 32B | Zhipu AI | api aggregator | $0.1 | $0.1 | 36.1 | 2263 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Qwen: Qwen3 Coder 480B A35B | Alibaba (Qwen Team) | api aggregator | $0.22 | $1.8 | 0.6 | — | 2× AMD MI300 · INT4 | Open → |
| ByteDance: UI-TARS 7B | Bytedance | api aggregator | $0.1 | $0.2 | 59.4 | 541 ms | 1× Nvidia P102-100 · INT4 | Open → |
| Google: Gemini 2.5 Flash Lite | Google DeepMind | api aggregator | $0.1 | $0.4 | 77.4 | 717 ms | API only | Open → |
| Qwen: Qwen3 235B A22B Instruct 2507 | Alibaba (Qwen Team) | api aggregator | $0.071 | $0.1 | 21.2 | 883 ms | 1× AMD MI300 · INT4 | Open → |
| Switchpoint Router | Switchpoint | api aggregator | $0.85 | $3.4 | — | — | API only | Open → |
| Mistral: Devstral Medium | Mistral AI | api aggregator | $0.4 | $2.0 | 49.1 | 557 ms | 1× Nvidia RTX A4000 · INT4 | Open → |
| Mistral: Devstral Small 1.1 | Mistral AI | api aggregator | $0.1 | $0.3 | 96.0 | 529 ms | 1× Nvidia RTX A4000 · INT4 | Open → |
| Tencent: Hunyuan A13B Instruct | Tencent | api aggregator | $0.14 | $0.57 | 32.4 | 1701 ms | 1× Nvidia A16 · INT4 | Open → |
| Morph: Morph V3 Large | Morph | api aggregator | $0.9 | $1.9 | 41.9 | 957 ms | 1× Nvidia RTX 6000 Ada · INT4 | Open → |
| Morph: Morph V3 Fast | Morph | api aggregator | $0.8 | $1.2 | 188.3 | 851 ms | 1× Nvidia P102-100 · INT4 | Open → |
| Baidu: ERNIE 4.5 VL 424B A47B | Baidu | api aggregator | $0.42 | $1.25 | 29.7 | 1532 ms | 1× Nvidia B300 · INT4 | Open → |
| Baidu: ERNIE 4.5 300B A47B | Baidu | api aggregator | $0.28 | $1.1 | 16.3 | 1756 ms | 1× AMD MI300 · INT4 | Open → |
| Mistral: Mistral Small 3.2 24B | Mistral AI | api aggregator | $0.075 | $0.2 | 30.9 | 979 ms | 1× Nvidia GeForce RTX 4080 · INT4 | Open → |
| MiniMax: MiniMax M1 | MiniMax | api aggregator | $0.4 | $2.2 | 87.4 | — | 1× Nvidia B300 · INT4 | Open → |
| Google: Gemini 2.5 Flash | Google DeepMind | api aggregator | $0.3 | $2.5 | 54.1 | 783 ms | API only | Open → |
| OpenAI: o3 Pro | OpenAI | api aggregator | $20.0 | $80.0 | 37.8 | 13225 ms | API only | Open → |
| Google: Gemini 2.5 Pro Preview 06-05 | Google DeepMind | api aggregator | $1.25 | $10.0 | 65.6 | 7257 ms | API only | Open → |
| DeepSeek: R1 0528 | DeepSeek | api aggregator | $0.5 | $2.15 | 20.2 | 15159 ms | 2× AMD MI325 · INT4 | Open → |
| Anthropic: Claude Opus 4 | Anthropic | api aggregator | $15.0 | $75.0 | 13.6 | 3411 ms | API only | Open → |
| Anthropic: Claude Sonnet 4 | Anthropic | api aggregator | $3.0 | $15.0 | 36.4 | 1911 ms | API only | Open → |
| Google: Gemma 3n 4B | Google DeepMind | api aggregator | $0.06 | $0.12 | 30.6 | 1233 ms | API only | Open → |
| Mistral: Mistral Medium 3 | Mistral AI | api aggregator | $0.4 | $2.0 | 37.5 | 672 ms | API only | Open → |
| Google: Gemini 2.5 Pro Preview 05-06 | Google DeepMind | api aggregator | $1.25 | $10.0 | 50.5 | 9615 ms | API only | Open → |
| Arcee AI: Spotlight | Arcee Ai | api aggregator | $0.18 | $0.18 | — | — | 1× Nvidia Titan V · FP8 | Open → |
| Arcee AI: Maestro Reasoning | Arcee Ai | api aggregator | $0.9 | $3.3 | — | — | API only | Open → |
| Arcee AI: Virtuoso Large | Arcee Ai | api aggregator | $0.75 | $1.2 | — | — | API only | Open → |
| Arcee AI: Coder Large | Arcee Ai | api aggregator | $0.5 | $0.8 | — | — | API only | Open → |
| Meta: Llama Guard 4 12B | Meta AI | api aggregator | $0.18 | $0.18 | 1.3 | 1500 ms | 1× Nvidia GTX 1070 Ti · INT4 | Open → |
| Qwen: Qwen3 30B A3B | Alibaba (Qwen Team) | api aggregator | $0.09 | $0.45 | 66.5 | 6445 ms | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Qwen: Qwen3 8B | Alibaba (Qwen Team) | api aggregator | $0.05 | $0.4 | 64.3 | 7088 ms | 1× Nvidia P102-100 · INT4 | Open → |
| Qwen: Qwen3 14B | Alibaba (Qwen Team) | api aggregator | $0.1 | $0.24 | 15.0 | 27050 ms | 1× Nvidia RTX 3080 · INT4 | Open → |
| Qwen: Qwen3 32B | Alibaba (Qwen Team) | api aggregator | $0.08 | $0.28 | 43.5 | — | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| Qwen: Qwen3 235B A22B | Alibaba (Qwen Team) | api aggregator | $0.455 | $1.82 | 71.2 | 8619 ms | 1× AMD MI300 · INT4 | Open → |
| OpenAI: o4 Mini High | OpenAI | api aggregator | $1.1 | $4.4 | 44.9 | — | API only | Open → |
| OpenAI: o3 | OpenAI | api aggregator | $2.0 | $8.0 | 86.7 | 5535 ms | API only | Open → |
| OpenAI: o4 Mini | OpenAI | api aggregator | $1.1 | $4.4 | 42.1 | — | API only | Open → |
| OpenAI: GPT-4.1 | OpenAI | api aggregator | $2.0 | $8.0 | 58.6 | 847 ms | API only | Open → |
| OpenAI: GPT-4.1 Mini | OpenAI | api aggregator | $0.4 | $1.6 | 59.1 | 930 ms | API only | Open → |
| OpenAI: GPT-4.1 Nano | OpenAI | api aggregator | $0.1 | $0.4 | 78.4 | 785 ms | API only | Open → |
| AlfredPros: CodeLLaMa 7B Instruct Solidity | Alfredpros | api aggregator | $0.8 | $1.2 | — | — | 1× Nvidia P102-100 · INT4 | Open → |
| Meta: Llama 4 Maverick | Meta AI | api aggregator | $0.15 | $0.6 | 38.1 | 672 ms | API only | Open → |
| Meta: Llama 4 Scout | Meta AI | api aggregator | $0.08 | $0.3 | 32.4 | 1020 ms | API only | Open → |
| DeepSeek: DeepSeek V3 0324 | DeepSeek | api aggregator | $0.2 | $0.77 | 20.9 | 733 ms | API only | Open → |
| OpenAI: o1-pro | OpenAI | api aggregator | $150.0 | $600.0 | 20.9 | — | API only | Open → |
| Mistral: Mistral Small 3.1 24B | Mistral AI | api aggregator | $0.351 | $0.555 | 19.6 | 672 ms | 1× Nvidia GeForce RTX 4080 · INT4 | Open → |
| Google: Gemma 3 12B | Google DeepMind | api aggregator | $0.04 | $0.13 | 53.2 | 506 ms | API only | Open → |
| Cohere: Command A | Cohere | api aggregator | $2.5 | $10.0 | 27.4 | 933 ms | API only | Open → |
| OpenAI: GPT-4o-mini Search Preview | OpenAI | api aggregator | $0.15 | $0.6 | 26.1 | 2145 ms | API only | Open → |
| OpenAI: GPT-4o Search Preview | OpenAI | api aggregator | $2.5 | $10.0 | 12.5 | 5998 ms | API only | Open → |
| Reka Flash 3 | Rekaai | api aggregator | $0.1 | $0.2 | 4.7 | 100812 ms | API only | Open → |
| TheDrummer: Skyfall 36B V2 | Thedrummer | api aggregator | $0.55 | $0.8 | 46.6 | 786 ms | 1× Nvidia RTX A5000 · INT4 | Open → |
| Perplexity: Sonar Reasoning Pro | Perplexity | api aggregator | $2.0 | $8.0 | 0.0 | — | API only | Open → |
| Perplexity: Sonar Pro | Perplexity | api aggregator | $3.0 | $15.0 | 35.1 | 1991 ms | API only | Open → |
| Perplexity: Sonar Deep Research | Perplexity | api aggregator | $2.0 | $8.0 | 0.0 | — | API only | Open → |
| Google: Gemini 2.0 Flash Lite | Google DeepMind | api aggregator | $0.075 | $0.3 | 90.3 | 569 ms | API only | Open → |
| Mistral: Saba | Mistral AI | api aggregator | $0.2 | $0.6 | 100.3 | 611 ms | API only | Open → |
| Llama Guard 3 8B | Meta AI | api aggregator | $0.484 | $0.03 | — | — | 1× Nvidia P102-100 · INT4 | Open → |
| OpenAI: o3 Mini High | OpenAI | api aggregator | $1.1 | $4.4 | 93.3 | — | API only | Open → |
| Google: Gemini 2.0 Flash | Google DeepMind | api aggregator | $0.1 | $0.4 | 37.6 | 1894 ms | API only | Open → |
| AionLabs: Aion-1.0 | Aion Labs | api aggregator | $4.0 | $8.0 | 24.2 | 1803 ms | API only | Open → |
| AionLabs: Aion-1.0-Mini | Aion Labs | api aggregator | $0.7 | $1.4 | 22.1 | 1893 ms | API only | Open → |
| AionLabs: Aion-RP 1.0 (8B) | Aion Labs | api aggregator | $0.8 | $1.6 | 34.1 | 987 ms | 1× Nvidia P102-100 · INT4 | Open → |
| Qwen: Qwen2.5 VL 72B Instruct | Alibaba (Qwen Team) | api aggregator | $0.25 | $0.75 | 28.7 | 1240 ms | 1× Nvidia L40S · INT4 | Open → |
| Qwen: Qwen-Plus | Alibaba (Qwen Team) | api aggregator | $0.26 | $0.78 | 46.7 | 438 ms | API only | Open → |
| OpenAI: o3 Mini | OpenAI | api aggregator | $1.1 | $4.4 | 231.5 | — | API only | Open → |
| Mistral: Mistral Small 3 | Mistral AI | api aggregator | $0.05 | $0.08 | 38.9 | 581 ms | 1× Nvidia GeForce RTX 4080 · INT4 | Open → |
| DeepSeek R1 Distill Qwen 32B | DeepSeek | api aggregator | $0.29 | $0.29 | 21.0 | — | 1× Nvidia RTX A5000 · INT4 | Open → |
| Perplexity: Sonar | Perplexity | api aggregator | $1.0 | $1.0 | 38.3 | 1974 ms | API only | Open → |
| MiniMax: MiniMax-01 | MiniMax | api aggregator | $0.2 | $1.1 | 27.7 | 898 ms | API only | Open → |
| Sao10K: Llama 3.1 70B Hanami x1 | Sao10k | api aggregator | $3.0 | $3.0 | 14.7 | 1395 ms | 1× Nvidia L40S · INT4 | Open → |
| DeepSeek: DeepSeek V3 | DeepSeek | api aggregator | $0.32 | $0.89 | 26.3 | 1577 ms | API only | Open → |
| Sao10K: Llama 3.3 Euryale 70B | Sao10k | api aggregator | $0.65 | $0.75 | 16.6 | 1559 ms | 1× Nvidia L40S · INT4 | Open → |
| OpenAI: o1 | OpenAI | api aggregator | $15.0 | $60.0 | 84.3 | — | API only | Open → |
| Cohere: Command R7B (12-2024) | Cohere | api aggregator | $0.0375 | $0.15 | 50.2 | 835 ms | API only | Open → |
| Llama 3.3 70B | Meta AI | api aggregator | $0.1 | $0.32 | 23.5 | 1167 ms | 1× Nvidia L40S · INT4 | Open → |
| Amazon: Nova Lite 1.0 | Amazon | api aggregator | $0.06 | $0.24 | 80.3 | 584 ms | API only | Open → |
| Amazon: Nova Micro 1.0 | Amazon | api aggregator | $0.035 | $0.14 | 54.0 | 1152 ms | API only | Open → |
| Amazon: Nova Pro 1.0 | Amazon | api aggregator | $0.8 | $3.2 | 58.7 | 1044 ms | API only | Open → |
| OpenAI: GPT-4o (2024-11-20) | OpenAI | api aggregator | $2.5 | $10.0 | 78.6 | 715 ms | API only | Open → |
| Mistral Large 2 | Mistral AI | api aggregator | $2.0 | $6.0 | 26.1 | 591 ms | 1× Nvidia H100 · INT4 | Open → |
| Mistral Large 2407 | Mistral AI | api aggregator | $2.0 | $6.0 | 50.8 | 719 ms | API only | Open → |
| Mistral: Pixtral Large 2411 | Mistral AI | api aggregator | $2.0 | $6.0 | 50.8 | 649 ms | API only | Open → |
| TheDrummer: UnslopNemo 12B | Thedrummer | api aggregator | $0.4 | $0.4 | 55.2 | 801 ms | 1× Nvidia GTX 1070 Ti · INT4 | Open → |
| Magnum v4 72B | Anthracite Org | api aggregator | $3.0 | $5.0 | 17.8 | 1123 ms | 1× Nvidia L40S · INT4 | Open → |
| Qwen: Qwen2.5 7B Instruct | Alibaba (Qwen Team) | api aggregator | $0.04 | $0.1 | 30.2 | 1742 ms | 1× Nvidia P102-100 · INT4 | Open → |
| Inflection: Inflection 3 Productivity | Inflection | api aggregator | $2.5 | $10.0 | 30.6 | 3314 ms | API only | Open → |
| Inflection: Inflection 3 Pi | Inflection | api aggregator | $2.5 | $10.0 | 25.9 | 3183 ms | API only | Open → |
| TheDrummer: Rocinante 12B | Thedrummer | api aggregator | $0.17 | $0.43 | 47.0 | 1176 ms | 1× Nvidia GTX 1070 Ti · INT4 | Open → |
| Cohere: Command R (08-2024) | Cohere | api aggregator | $0.15 | $0.6 | 32.3 | 1269 ms | API only | Open → |
| Cohere: Command R+ (08-2024) | Cohere | api aggregator | $2.5 | $10.0 | 32.8 | 826 ms | API only | Open → |
| Sao10K: Llama 3.1 Euryale 70B v2.2 | Sao10k | api aggregator | $0.85 | $0.85 | 31.8 | 658 ms | 1× Nvidia L40S · INT4 | Open → |
| Nous: Hermes 3 405B Instruct | Nous Research | api aggregator | $1.0 | $1.0 | 23.4 | 577 ms | 1× AMD MI325 · INT4 | Open → |
| Sao10K: Llama 3 8B Lunaris | Sao10k | api aggregator | $0.04 | $0.05 | 18.6 | 3122 ms | 1× Nvidia P102-100 · INT4 | Open → |
| OpenAI: GPT-4o (2024-08-06) | OpenAI | api aggregator | $2.5 | $10.0 | 54.3 | 1138 ms | API only | Open → |
| OpenAI: GPT-4o-mini (2024-07-18) | OpenAI | api aggregator | $0.15 | $0.6 | 36.6 | 1110 ms | API only | Open → |
| GPT-4o Mini | OpenAI | api aggregator | $0.15 | $0.6 | 34.8 | 1446 ms | API only | Open → |
| Google: Gemma 2 27B | Google DeepMind | api aggregator | $0.65 | $0.65 | 16.6 | 3406 ms | API only | Open → |
| Sao10k: Llama 3 Euryale 70B v2.1 | Sao10k | api aggregator | $1.48 | $1.48 | 35.5 | 1009 ms | 1× Nvidia L40S · INT4 | Open → |
| NousResearch: Hermes 2 Pro - Llama-3 8B | Nous Research | api aggregator | $0.14 | $0.14 | 0.0 | — | 1× Nvidia P102-100 · INT4 | Open → |
| OpenAI: GPT-4o (2024-05-13) | OpenAI | api aggregator | $5.0 | $15.0 | 69.4 | 953 ms | API only | Open → |
| Meta: Llama 3 70B Instruct | Meta AI | api aggregator | $0.51 | $0.74 | 21.1 | 902 ms | 1× Nvidia L40S · INT4 | Open → |
| Meta: Llama 3 8B Instruct | Meta AI | api aggregator | $0.04 | $0.04 | 49.0 | 646 ms | 1× Nvidia P102-100 · INT4 | Open → |
| WizardLM-2 8x22B | Microsoft | api aggregator | $0.62 | $0.62 | 16.9 | 939 ms | 1× Nvidia GeForce RTX 4080 · INT4 | Open → |
| GPT-4 Turbo | OpenAI | api aggregator | $10.0 | $30.0 | 32.5 | 874 ms | API only | Open → |
| Anthropic: Claude 3 Haiku | Anthropic | api aggregator | $0.25 | $1.25 | 64.7 | 632 ms | API only | Open → |
| Mistral Large | Mistral AI | api aggregator | $2.0 | $6.0 | 45.8 | 720 ms | API only | Open → |
| OpenAI: GPT-3.5 Turbo (older v0613) | OpenAI | api aggregator | $1.0 | $2.0 | 61.2 | 813 ms | API only | Open → |
| Auto Router | Openrouter | api aggregator | — | — | 46.1 | 2017 ms | API only | Open → |
| Mistral: Mistral 7B Instruct v0.1 | Mistral AI | api aggregator | $0.11 | $0.19 | 14.8 | 631 ms | 1× Nvidia P102-100 · INT4 | Open → |
| OpenAI: GPT-3.5 Turbo Instruct | OpenAI | api aggregator | $1.5 | $2.0 | 72.6 | 570 ms | API only | Open → |
| OpenAI: GPT-3.5 Turbo 16k | OpenAI | api aggregator | $3.0 | $4.0 | 25.0 | 3842 ms | API only | Open → |
| Mancer: Weaver (alpha) | Mancer | api aggregator | $0.75 | $1.0 | 54.4 | 1301 ms | API only | Open → |
| ReMM SLERP 13B | Undi95 | api aggregator | $0.45 | $0.65 | 23.2 | 978 ms | 1× Nvidia RTX 3080 · INT4 | Open → |
| MythoMax 13B | Gryphe | api aggregator | $0.06 | $0.06 | 20.0 | 1930 ms | 1× Nvidia RTX 3080 · INT4 | Open → |
| OpenAI: GPT-3.5 Turbo | OpenAI | api aggregator | $0.5 | $1.5 | 43.4 | 1101 ms | API only | Open → |
| OpenAI: GPT-4 | OpenAI | api aggregator | $30.0 | $60.0 | 46.9 | 1115 ms | API only | Open → |
| OpenAI: GPT-4 Turbo (older v1106) | OpenAI | api aggregator | $10.0 | $30.0 | — | — | API only | Open → |
| OpenAI: GPT-4 (older v0314) | OpenAI | api aggregator | $30.0 | $60.0 | — | — | API only | Open → |
| Gemma 3 4b it | Google DeepMind | api aggregator | $0.04 | $0.08 | — | — | 1× Nvidia Titan V · INT4 | Open → |
| Gemma 3 27B It | Google DeepMind | api aggregator | $0.08 | $0.16 | — | — | 1× Nvidia RTX 4000 Ada · INT4 | Open → |
| meta-llama/Llama-3.3-70B-Instruct | Meta AI | api aggregator | $0.1 | $0.32 | — | — | 1× Nvidia A40 · INT4 | Open → |
| Meta Llama 3.2 1B Instruct | Meta AI | api aggregator | $0.027 | $0.201 | — | — | 1× Nvidia Titan V · FP16 | Open → |
| Meta Llama 3.2 3B Instruct | Meta AI | api aggregator | $0.0509 | $0.335 | — | — | 1× Nvidia GeForce GTX 1050 · INT4 | Open → |
| OpenAI: GPT-4 Turbo Preview | OpenAI | api aggregator | $10.0 | $30.0 | 9.6 | 10721 ms | API only | Open → |
| Llama-3.2-11B-Vision-Instruct | Meta AI | api aggregator | $0.245 | $0.245 | — | — | 1× Nvidia RTX 3070 Ti · INT4 | Open → |
| Hermes-3-Llama-3.1-70B | Nous Research | api aggregator | $0.3 | $0.3 | — | — | 1× Nvidia A40 · INT4 | Open → |
| Claude Haiku 4.5 | Anthropic | api aggregator | $1.0 | $5.0 | 39.5 | 3388 ms | API only | Open → |
| GPT-5 | OpenAI | api aggregator | $1.25 | $10.0 | 30.3 | — | API only | Open → |
| GPT-OSS 120B | OpenAI | api aggregator | $0.039 | $0.18 | 57.5 | 8629 ms | 1× Nvidia H100 · INT4 | Open → |