Every way
to use the major models.
Closed models like Claude and GPT — link to the cheapest API provider. Open-weights like Llama, Kimi, DeepSeek — choose hosted inference or self-host on rented GPUs.
Find the sweet spot.
Higher = stronger benchmark composite · further left = cheaper input
34 models match — reset filters
Closed / API-only models.
Direct API, aggregator (OpenRouter, Bedrock), or chat UI.
Claude Opus 4.7
Frontier reasoning and long-form coding from Anthropic.
Claude Sonnet 4.6
Best price-performance from Anthropic. Default for production agents.
Claude 3.5 Sonnet
Anthropic's 3.5 generation — still in active production.
Claude Haiku 4.5
Fast, cheap Claude variant for high-throughput inference.
Claude 3.5 Haiku
Fast/cheap Claude 3.5 variant — production fallback for Haiku 4.5.
GPT-5
OpenAI's frontier multimodal reasoning model.
GPT-4 Turbo
OpenAI's pre-GPT-5 flagship — still extensively deployed.
AI21: Jamba Large 1.7
Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall effi...
Amazon: Nova Micro 1.0
Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low c...
Body Builder (beta)
Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI mod...
Cohere: Command A
Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, mult...
Cohere: Command R (08-2024)
command-r-08-2024 is an update of the [Command R](/models/cohere/command-r) with improved performance for multilingual retrieval-augmente...
Cohere: Command R+ (08-2024)
command-r-plus-08-2024 is an update of the [Command R+](/models/cohere/command-r-plus) with roughly 50% higher throughput and 25% lower l...
Cohere: Command R7B (12-2024)
Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, tool use, agents, an...
Google: Gemma 2 27B
Gemma 2 27B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini). ...
Google: Gemma 3n 4B
Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It support...
Inflection: Inflection 3 Pi
Inflection 3 Pi powers Inflection's [Pi](https://pi.ai) chatbot, including backstory, emotional intelligence, productivity, and safety. I...
Inflection: Inflection 3 Productivity
Inflection 3 Productivity is optimized for following instructions. It is better for tasks requiring JSON output or precise adherence to p...
OpenAI: GPT-3.5 Turbo
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and tradition...
OpenAI: GPT-3.5 Turbo 16k
This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single reque...
OpenAI: GPT-3.5 Turbo Instruct
This model is a variant of GPT-3.5 Turbo tuned for instructional prompts and omitting chat-related optimizations. Training data: up to Se...
OpenAI: GPT-3.5 Turbo (older v0613)
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and tradition...
OpenAI: GPT-4
OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of solving difficult problems with greater accuracy tha...
OpenAI: GPT-4 (older v0314)
GPT-4-0314 is the first version of GPT-4 released, with a context length of 8,192 tokens, and was supported until June 14. Training data:...
OpenAI: GPT-4o-mini Search Preview
GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search ...
OpenAI: GPT-4o Search Preview
GPT-4o Search Previewis a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.
OpenAI: GPT-4 Turbo (older v1106)
The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling. Training data: up to ...
OpenAI: GPT-4 Turbo Preview
The preview GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Traini...
OpenAI: gpt-oss-safeguard-20b
gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts ...
OpenAI: o3 Mini
OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and...
OpenAI: o3 Mini High
OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort set to high. o3-mini is a cost-efficient langua...
Owl Alpha
Owl Alpha is a high-performance foundation model designed for agentic workloads. Natively supports tool use, and long-context tasks, with...
Pareto Code Router
The Pareto Router maintains a tiered shortlist of strong coding models, ranked by [Artificial Analysis](https://artificialanalysis.ai/) c...
Perplexity: Sonar Deep Research
Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across complex topics. It aut...