AI models

Every way
to use the major models.

Closed models like Claude and GPT — link to the cheapest API provider. Open-weights like Llama, Kimi, DeepSeek — choose hosted inference or self-host on rented GPUs.

166 tracked · 68 open weights · 98 closed APIs · cheapest input $0.01/M
Quality × Price

Find the sweet spot.

Higher = stronger benchmark composite · further left = cheaper input

Loading...

166 models match — reset filters

Open-weights models.

Run yourself on cheap GPUs, or use a hosted-inference provider.

Gemma 3 27B

27B
by Google DeepMind · Gemma · 128,000 ctx

Google's open-weight multimodal LLM — efficient and license-permissive.

Gemma 3 12B

12B
by Google DeepMind · Gemma · 128,000 ctx

12B Gemma 3 — multimodal, single-GPU target.

Gemma 3 4B

4B
by Google DeepMind · Gemma · 128,000 ctx

4B Gemma 3 — laptop multimodal.

Kimi K2.5

1000B
by Moonshot AI · Kimi · 256,000 ctx

Multimodal agentic variant — adds a vision encoder to the K2 backbone.

Arcee AI: Spotlight

1B
by Arcee Ai · 131,072 ctx

Spotlight is a 7‑billion‑parameter vision‑language model derived from Qwen 2.5‑VL and fine‑tuned by Arcee AI for tight image‑text groundi...

Baidu: ERNIE 4.5 VL 28B A3B

28B
by Baidu · 131,072 ctx

A powerful multimodal Mixture-of-Experts chat model featuring 28B total parameters with 3B activated per token, delivering exceptional te...

Baidu: ERNIE 4.5 VL 424B A47B

424B
by Baidu · 131,072 ctx

ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 series, featuring 424B total parameters with...

Baidu: Qianfan-OCR-Fast

multimodal
by Baidu · 65,536 ctx

Qianfan-OCR-Fast is a domain-specific multimodal large model purpose-built for OCR. By leveraging specialized OCR training data while pre...

ByteDance Seed: Seed 1.6

200B
by Bytedance Seed · 262,144 ctx

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinki...

ByteDance Seed: Seed 1.6 Flash

multimodal
by Bytedance Seed · 262,144 ctx

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It featu...

ByteDance Seed: Seed-2.0-Lite

32B
by Bytedance Seed · 262,144 ctx

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities while offering n...

ByteDance Seed: Seed-2.0-Mini

multimodal
by Bytedance Seed · 262,144 ctx

Seed-2.0-mini targets latency-sensitive, high-concurrency, and cost-sensitive scenarios, emphasizing fast response and flexible inference...

ByteDance: UI-TARS 7B

7B
by Bytedance · 128,000 ctx

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, web browsers, mobil...

gemini-3.1-pro

multimodal
by Google DeepMind · 1,000,000 ctx

Bring any idea to life with state-of-the-art reasoning to help you learn, build, and plan anything. Best for complex tasks and bringing c...

Kimi K2.5

multimodal
by Moonshot AI · 262,144 ctx

Kimi K2.5 is Moonshot AI's flagship agentic model and a new SOTA open model. It unifies vision and text, thinking and non-thinking modes,...

Kimi K2.6

multimodal
by Moonshot AI · 262,144 ctx

Kimi K2.6 is an open-source, native multimodal agentic model that advances practical capabilities in long-horizon coding, coding-driven d...

Llama-3.2-11B-Vision-Instruct

11B
by Meta AI · 131,072 ctx

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It exc...

Meta: Llama 4 Maverick

multimodal
by Meta AI · 1,048,576 ctx

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architec...

Meta: Llama 4 Scout

multimodal
by Meta AI · 10,000,000 ctx

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of ...

Meta: Llama Guard 4 12B

12B
by Meta AI · 163,840 ctx

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous v...

MiniMax: MiniMax-01

multimodal
by MiniMax · 1,000,192 ctx

MiniMax-01 is a combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image understanding. It has 456 billion parameters, wi...

Mistral: Ministral 3 14B 2512

14B
by Mistral AI · 262,144 ctx

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistra...

Mistral: Ministral 3 3B 2512

3B
by Mistral AI · 131,072 ctx

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

Mistral: Ministral 3 8B 2512

8B
by Mistral AI · 262,144 ctx

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

Mistral: Mistral Large 3 2512

multimodal
by Mistral AI · 262,144 ctx

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active paramete...

Mistral: Mistral Medium 3

multimodal
by Mistral AI · 131,072 ctx

Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly r...

Mistral: Mistral Medium 3.1

multimodal
by Mistral AI · 131,072 ctx

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to del...

Mistral: Mistral Medium 3.5

multimodal
by Mistral AI · 262,144 ctx

Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It supports text and image inputs with text output, and i...

Mistral: Mistral Small 3.1 24B

24B
by Mistral AI · 128,000 ctx

Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring 24 billion parameters with advanced multimodal...

Mistral: Mistral Small 3.2 24B

24B
by Mistral AI · 128,000 ctx

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduct...

Mistral: Mistral Small 4

multimodal
by Mistral AI · 262,144 ctx

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into ...

Mistral: Pixtral Large 2411

multimodal
by Mistral AI · 131,072 ctx

Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral Large 2](/mistralai/mistral-large-2411). The mo...

Mistral-Small-3.2-24B-Instruct-2506

24B
by Mistral AI · 128,000 ctx

Mistral-Small-3.2-24B-Instruct is a drop-in upgrade over the 3.1 release, with markedly better instruction following, roughly half the in...

MoonshotAI Kimi Latest

1000B
by Moonshot AI · 262,144 ctx

This model always redirects to the latest model in the MoonshotAI Kimi family.

Nemotron-3-Nano-Omni-30B-A3B-Reasoning

30B
by Nvidia · 262,144 ctx

Nemotron 3 Nano Omni is an open multimodal model built on a hybrid Mixture-of-Experts (MoE) architecture, engineered for high efficiency ...

Perceptron: Perceptron Mk1

multimodal
by Perceptron · 32,768 ctx

Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning.** It accepts image and ...

Qwen3.5-0.8B

8B
by Alibaba (Qwen Team) · 262,144 ctx

Qwen3.5-0.8B is Alibaba's smallest model in the Qwen3.5 series, featuring a hybrid Gated Delta Networks and sparse Mixture-of-Experts arc...

Qwen3.5-2B

2B
by Alibaba (Qwen Team) · 262,144 ctx

Qwen3.5-2B is a compact yet capable model from Alibaba's Qwen3.5 series. It features a 262K token context window, support for 201 languag...

Qwen3.5-4B

4B
by Alibaba (Qwen Team) · 262,144 ctx

Qwen3.5-4B is a mid-size model from Alibaba's Qwen3.5 series that delivers a strong balance of performance and efficiency. It features a ...

Qwen: Qwen2.5 VL 72B Instruct

72B
by Alibaba (Qwen Team) · 131,072 ctx

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing ...

Qwen: Qwen3.5-122B-A10B

122B
by Alibaba (Qwen Team) · 262,144 ctx

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a ...

Qwen: Qwen3.5-27B

27B
by Alibaba (Qwen Team) · 262,144 ctx

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balanc...

Qwen: Qwen3.5-35B-A3B

35B
by Alibaba (Qwen Team) · 262,144 ctx

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechani...

Qwen: Qwen3.5 397B A17B

397B
by Alibaba (Qwen Team) · 262,144 ctx

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism ...

Qwen: Qwen3.5-9B

9B
by Alibaba (Qwen Team) · 262,144 ctx

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understandi...

Qwen: Qwen3.5-Flash

multimodal
by Alibaba (Qwen Team) · 1,000,000 ctx

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sp...

Qwen: Qwen3.5 Plus 2026-02-15

multimodal
by Alibaba (Qwen Team) · 1,000,000 ctx

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with...

Qwen: Qwen3.5 Plus 2026-04-20

235B
by Alibaba (Qwen Team) · 1,000,000 ctx

Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video input and produces t...

Qwen: Qwen3.6 27B

27B
by Alibaba (Qwen Team) · 262,144 ctx

Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid mult...

Qwen: Qwen3.6 35B A3B

35B
by Alibaba (Qwen Team) · 262,144 ctx

Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters pe...

Qwen: Qwen3.6 Flash

multimodal
by Alibaba (Qwen Team) · 1,000,000 ctx

Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video input with a 1M toke...

Qwen: Qwen3.6 Plus

multimodal
by Alibaba (Qwen Team) · 1,000,000 ctx

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling s...

Qwen: Qwen3 VL 235B A22B Instruct

235B
by Alibaba (Qwen Team) · 262,144 ctx

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across image...

Qwen: Qwen3 VL 235B A22B Thinking

235B
by Alibaba (Qwen Team) · 131,072 ctx

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. ...

Qwen: Qwen3 VL 30B A3B Instruct

30B
by Alibaba (Qwen Team) · 262,144 ctx

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its ...

Qwen: Qwen3 VL 30B A3B Thinking

30B
by Alibaba (Qwen Team) · 131,072 ctx

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its ...

Qwen: Qwen3 VL 32B Instruct

32B
by Alibaba (Qwen Team) · 262,144 ctx

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across te...

Qwen: Qwen3 VL 8B Instruct

8B
by Alibaba (Qwen Team) · 256,000 ctx

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning ...

Qwen: Qwen3 VL 8B Thinking

8B
by Alibaba (Qwen Team) · 256,000 ctx

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual rea...

Reka Edge

7B
by Rekaai · 16,384 ctx

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. ...

Seed-1.8

200B
by Bytedance · 256,000 ctx

Optimized specifically for multimodal agent scenarios. It features enhanced agent capabilities, upgraded multimodal comprehension, and mo...

Seed-2.0-code

multimodal
by Bytedance · 256,000 ctx

A coding model optimized for real-world development environments, with reliable tool use in common IDEs such as Claude Code. It delivers ...

Seed-2.0-pro

multimodal
by Bytedance · 256,000 ctx

Built for the Agent era, it delivers stable performance in complex reasoning and long-horizon tasks, including multi-step planning, visua...

Xiaomi: MiMo-V2.5

multimodal
by Xiaomi · 1,048,576 ctx

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference cost, while surp...

Xiaomi: MiMo-V2-Omni

multimodal
by Xiaomi · 262,144 ctx

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It comb...

Z.ai: GLM 4.5V

9B
by Zhipu AI · 65,536 ctx

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 1...

Z.ai: GLM 4.6V

multimodal
by Zhipu AI · 131,072 ctx

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents,...

Z.ai: GLM 5V Turbo

multimodal
by Zhipu AI · 202,752 ctx

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively ...

Closed / API-only models.

Direct API, aggregator (OpenRouter, Bedrock), or chat UI.

GPT-4o

multimodal
by OpenAI · GPT · 128,000 ctx

OpenAI's multimodal model — text, vision, audio in one.

GPT-4o Mini

multimodal
by OpenAI · GPT · 128,000 ctx

Cheap multimodal default — replaced GPT-3.5 Turbo for low-cost workloads.

Gemini 2.5 Pro

multimodal
by Google DeepMind · Gemini · 1,000,000 ctx

Google's frontier reasoning model with native 1M-token context.

Gemini 1.5 Pro

multimodal
by Google DeepMind · Gemini · 1,000,000 ctx

Google's pre-2.5 frontier — 2M context launched here.

Gemini 1.5 Flash

multimodal
by Google DeepMind · Gemini · 1,000,000 ctx

Cheap fast Gemini — production default before 2.0/2.5 Flash.

Grok 3

multimodal
by xAI · Grok · 1,000,000 ctx

xAI's frontier model with built-in DeepSearch + real-time X integration.

Amazon: Nova 2 Lite

multimodal
by Amazon · 1,000,000 ctx

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. ...

Amazon: Nova Lite 1.0

multimodal
by Amazon · 300,000 ctx

Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focused on fast processing of image, video, and text inputs to ...

Amazon: Nova Premier 1.0

multimodal
by Amazon · 1,000,000 ctx

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the best teacher for dis...

Amazon: Nova Pro 1.0

multimodal
by Amazon · 300,000 ctx

Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and cost for a wide ...

Anthropic: Claude 3 Haiku

multimodal
by Anthropic · 200,000 ctx

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. S...

Anthropic Claude Haiku Latest

multimodal
by Anthropic · 200,000 ctx

This model always redirects to the latest model in the Anthropic Claude Haiku family.

Anthropic: Claude Opus 4

multimodal
by Anthropic · 200,000 ctx

Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-runnin...

Anthropic: Claude Opus 4.1

multimodal
by Anthropic · 200,000 ctx

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic task...

Anthropic: Claude Opus 4.5

multimodal
by Anthropic · 200,000 ctx

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon c...

Anthropic: Claude Opus 4.6

multimodal
by Anthropic · 1,000,000 ctx

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire...

Anthropic: Claude Opus 4.6 (Fast)

multimodal
by Anthropic · 1,000,000 ctx

Fast-mode variant of [Opus 4.6](/anthropic/claude-opus-4.6) - identical capabilities with higher output speed at premium 6x pricing. Lea...

Anthropic: Claude Opus 4.7 (Fast)

multimodal
by Anthropic · 1,000,000 ctx

Fast-mode variant of [Opus 4.7](/anthropic/claude-opus-4.7) - identical capabilities with higher output speed at premium 6x pricing. Lea...

Anthropic: Claude Opus Latest

multimodal
by Anthropic · 1,000,000 ctx

This model always redirects to the latest model in the Claude Opus family.

Anthropic: Claude Sonnet 4

multimodal
by Anthropic · 1,000,000 ctx

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with...

Anthropic: Claude Sonnet 4.5

multimodal
by Anthropic · 1,000,000 ctx

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers st...

Anthropic Claude Sonnet Latest

multimodal
by Anthropic · 1,000,000 ctx

This model always redirects to the latest model in the Anthropic Claude Sonnet family.

Auto Router

multimodal
by Openrouter · 2,000,000 ctx

Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output....

Free Models Router

multimodal
by Openrouter · 200,000 ctx

The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenR...

Google: Gemini 2.0 Flash

multimodal
by Google DeepMind · 1,000,000 ctx

Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while...

Google: Gemini 2.0 Flash Lite

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), ...

Google: Gemini 2.5 Flash

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and sci...

Google: Gemini 2.5 Flash Lite

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It ...

Google: Gemini 2.5 Flash Lite Preview 09-2025

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It ...

Google: Gemini 2.5 Pro Preview 05-06

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It emplo...

Google: Gemini 2.5 Pro Preview 06-05

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It emplo...

Google: Gemini 3.1 Flash Lite

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text,...

Google: Gemini 3.1 Flash Lite Preview

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite...

Google: Gemini 3.1 Pro Preview

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic relia...

Google: Gemini 3.1 Pro Preview Custom Tools

multimodal
by Google DeepMind · 1,048,756 ctx

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a gener...

Google: Gemini 3.5 Flash

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tier cost and speed....

Google: Gemini 3 Flash Preview

multimodal
by Google DeepMind · 1,048,576 ctx

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance....

Google Gemini Flash Latest

multimodal
by ~google · 1,048,576 ctx

This model always redirects to the latest model in the Google Gemini Flash family.

Google Gemini Pro Latest

multimodal
by ~google · 1,048,576 ctx

This model always redirects to the latest model in the Google Gemini Pro family.

Google: Gemma 3 12B

12B
by Google DeepMind · 131,072 ctx

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, unders...

Google: Gemma 4 26B A4B

26B
by Google DeepMind · 262,144 ctx

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B...

Google: Gemma 4 31B

31B
by Google DeepMind · 262,144 ctx

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K ...

Google: Lyria 3 Clip Preview

multimodal
by Google DeepMind · 1,048,576 ctx

30 second duration clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemin...

Google: Lyria 3 Pro Preview

multimodal
by Google DeepMind · 1,048,576 ctx

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. ...

Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)

multimodal
by Google DeepMind · 131,072 ctx

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, deliverin...

Google: Nano Banana (Gemini 2.5 Flash Image)

multimodal
by Google DeepMind · 32,768 ctx

Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state of the art image generation model with contextual...

Google: Nano Banana Pro (Gemini 3 Pro Image Preview)

multimodal
by Google DeepMind · 65,536 ctx

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana ...

OpenAI: GPT-4.1

multimodal
by OpenAI · 1,047,576 ctx

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-contex...

OpenAI: GPT-4.1 Mini

multimodal
by OpenAI · 1,047,576 ctx

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 ...

OpenAI: GPT-4.1 Nano

multimodal
by OpenAI · 1,047,576 ctx

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performa...

OpenAI: GPT-4o (2024-05-13)

multimodal
by OpenAI · 128,000 ctx

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligen...

OpenAI: GPT-4o (2024-08-06)

multimodal
by OpenAI · 128,000 ctx

The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with the ability to supply a JSON schema in the respo...

OpenAI: GPT-4o (2024-11-20)

multimodal
by OpenAI · 128,000 ctx

The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more natural, engaging, and tailored writing to improv...

OpenAI: GPT-4o-mini (2024-07-18)

multimodal
by OpenAI · 128,000 ctx

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. ...

OpenAI: GPT-5.1

multimodal
by OpenAI · 400,000 ctx

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adheren...

OpenAI: GPT-5.1 Chat

multimodal
by OpenAI · 128,000 ctx

GPT-5.1 Chat (AKA Instant is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong genera...

OpenAI: GPT-5.1-Codex

multimodal
by OpenAI · 400,000 ctx

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both intera...

OpenAI: GPT-5.1-Codex-Max

multimodal
by OpenAI · 400,000 ctx

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is base...

OpenAI: GPT-5.1-Codex-Mini

multimodal
by OpenAI · 400,000 ctx

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex

OpenAI: GPT-5.2

multimodal
by OpenAI · 400,000 ctx

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long context perfomance compared to GPT-5.1...

OpenAI: GPT-5.2 Chat

multimodal
by OpenAI · 128,000 ctx

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong gener...

OpenAI: GPT-5.2-Codex

multimodal
by OpenAI · 400,000 ctx

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both in...

OpenAI: GPT-5.2 Pro

multimodal
by OpenAI · 400,000 ctx

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. I...

OpenAI: GPT-5.3 Chat

multimodal
by OpenAI · 128,000 ctx

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful...

OpenAI: GPT-5.3-Codex

multimodal
by OpenAI · 400,000 ctx

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex wi...

OpenAI: GPT-5.4

multimodal
by OpenAI · 1,050,000 ctx

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window ...

OpenAI: GPT-5.4 Image 2

multimodal
by OpenAI · 272,000 ctx

[GPT-5.4](https://openrouter.ai/openai/gpt-5.4) Image 2 combines OpenAI's GPT-5.4 model with state-of-the-art image generation capabiliti...

OpenAI: GPT-5.4 Mini

multimodal
by OpenAI · 400,000 ctx

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It suppor...

OpenAI: GPT-5.4 Nano

multimodal
by OpenAI · 400,000 ctx

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks...

OpenAI: GPT-5.4 Pro

multimodal
by OpenAI · 1,050,000 ctx

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex,...

OpenAI: GPT-5.5

multimodal
by OpenAI · 1,050,000 ctx

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reasoning, higher relia...

OpenAI: GPT-5.5 Pro

multimodal
by OpenAI · 1,050,000 ctx

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a ...

OpenAI: GPT-5 Chat

multimodal
by OpenAI · 128,000 ctx

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.

OpenAI: GPT-5 Codex

multimodal
by OpenAI · 400,000 ctx

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactiv...

OpenAI: GPT-5 Image

multimodal
by OpenAI · 400,000 ctx

[GPT-5](https://openrouter.ai/openai/gpt-5) Image combines OpenAI's GPT-5 model with state-of-the-art image generation capabilities. It o...

OpenAI: GPT-5 Image Mini

multimodal
by OpenAI · 400,000 ctx

GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by [GPT-5 Mini](https://openrouter.ai/openai/gpt-5-mini), with...

OpenAI: GPT-5 Mini

multimodal
by OpenAI · 400,000 ctx

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following a...

OpenAI: GPT-5 Nano

multimodal
by OpenAI · 400,000 ctx

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low late...

OpenAI: GPT-5 Pro

multimodal
by OpenAI · 400,000 ctx

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized f...

OpenAI: GPT Chat Latest

multimodal
by OpenAI · 400,000 ctx

GPT Chat Latest points to OpenAI's stable API alias `chat-latest` that always resolves to the latest Instant chat model used in ChatGPT. ...

OpenAI GPT Latest

multimodal
by OpenAI · 1,050,000 ctx

This model always redirects to the latest model in the OpenAI GPT family.

OpenAI GPT Mini Latest

multimodal
by OpenAI · 400,000 ctx

This model always redirects to the latest model in the OpenAI GPT Mini family.

OpenAI: o1

multimodal
by OpenAI · 200,000 ctx

The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 model series is t...

OpenAI: o1-pro

multimodal
by OpenAI · 200,000 ctx

The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro mod...

OpenAI: o3

multimodal
by OpenAI · 200,000 ctx

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It ...

OpenAI: o3 Deep Research

multimodal
by OpenAI · 200,000 ctx

o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks. Note: This model a...

OpenAI: o3 Pro

multimodal
by OpenAI · 200,000 ctx

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro mode...

OpenAI: o4 Mini

multimodal
by OpenAI · 200,000 ctx

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multim...

OpenAI: o4 Mini Deep Research

multimodal
by OpenAI · 200,000 ctx

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks. Not...

OpenAI: o4 Mini High

multimodal
by OpenAI · 200,000 ctx

OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high. OpenAI o4-mini is a compact reason...

Perplexity: Sonar

multimodal
by Perplexity · 127,072 ctx

Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources. It is designed ...

Perplexity: Sonar Pro

multimodal
by Perplexity · 200,000 ctx

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing...

Perplexity: Sonar Pro Search

multimodal
by Perplexity · 200,000 ctx

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is d...

Perplexity: Sonar Reasoning Pro

multimodal
by Perplexity · 128,000 ctx

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing...

xAI: Grok 4.20

multimodal
by xAI · 2,000,000 ctx

Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines the lowest halluci...

xAI: Grok 4.20 Multi-Agent

multimodal
by xAI · 2,000,000 ctx

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in paral...

xAI: Grok 4.3

multimodal
by xAI · 1,000,000 ctx

Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic workflows, instructi...

xAI: Grok Build 0.1

multimodal
by xAI · 256,000 ctx

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports text and image inp...