Cohere.

Cohere's first-party API: the Command series for chat and the Embed series for retrieval.

Cheapest 12 models

Where the floor is.

Sorted cheapest-first by $/M input. Useful when you're looking for the floor before picking a model.
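
As a rough illustration of that ordering, the sketch below sorts by input price in Python. The names and $/M input prices are copied from the "Models on Cohere" table on this page; the `models` list and the `cheapest_first` variable are hypothetical names used only for this example.

```python
# (model name, USD per million input tokens), copied from the table on this page.
models = [
    ("Cohere: Command A", 2.5),
    ("Cohere: Command R7B (12-2024)", 0.0375),
    ("Cohere: Command R (08-2024)", 0.15),
    ("Cohere: Command R+ (08-2024)", 2.5),
    ("Command R+", 2.5),
]

# Sort ascending by input price; Python's sort is stable, so models
# with the same price keep their original relative order.
cheapest_first = sorted(models, key=lambda m: m[1])

for name, price_in in cheapest_first:
    print(f"{name}: ${price_in}/M input")
```

The first entry of `cheapest_first` is the price floor for this provider.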

At a glance

Service type: First-party APIs
Trust tier: Tier 1
Headquarters: CA
OpenAI-compat: No
Open weights: Yes
Proprietary: Yes

When to pick Cohere

Best for

  • Full feature coverage — prompt caching, batch tier, function calling, fine-tuning.
  • The lowest per-token rate for the maker's own models.
  • Production workloads where a direct billing relationship matters.

Avoid for

  • Multi-model workflows that need a unified billing surface.
  • Anywhere the maker's own SLA isn't sufficient.

Models on Cohere

Pricing, measured speed, and self-host alternative, one row per model.

5 models · 0 benchmarked
Model                         | Maker  | Access     | $/M in  | $/M out | Tokens/sec | TTFT | Self-host on
Cohere: Command A             | Cohere | api direct | $2.5    | $10.0   | -          | -    | API only
Cohere: Command R7B (12-2024) | Cohere | api direct | $0.0375 | $0.15   | -          | -    | API only
Cohere: Command R (08-2024)   | Cohere | api direct | $0.15   | $0.6    | -          | -    | API only
Cohere: Command R+ (08-2024)  | Cohere | api direct | $2.5    | $10.0   | -          | -    | API only
Command R+                    | Cohere | api direct | $2.5    | $10.0   | -          | -    | 1× Nvidia H100 · INT4
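
To show how the $/M in and $/M out columns translate into a bill, here is a minimal Python sketch. The prices are copied from the table above; the `estimate_cost` helper and the token counts in the usage line are hypothetical. It assumes plain per-token billing and ignores any caching or batch discounts.

```python
# USD per million tokens as (input, output), copied from the table above.
PRICES = {
    "Command A": (2.5, 10.0),
    "Command R7B (12-2024)": (0.0375, 0.15),
    "Command R (08-2024)": (0.15, 0.6),
    "Command R+ (08-2024)": (2.5, 10.0),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return an estimated cost in USD for one workload on the given model."""
    price_in, price_out = PRICES[model]
    # Prices are quoted per million tokens, so scale the total down by 1e6.
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens.
cost = estimate_cost("Command R (08-2024)", 2_000_000, 500_000)
print(f"${cost:.2f}")
```

Because output tokens cost four times as much as input tokens on every row in the table, output-heavy workloads shift the comparison more than the $/M in column alone suggests.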