First-party APIs
CA
Cohere.
Cohere's first-party API — Command series for chat + Embed series for retrieval.
Cheapest 12 models
Where the floor is.
Sorted cheapest-first by $/M input. Useful when you're looking for the floor before picking a model.
Loading...
At a glance
- Service type
- First-party APIs
- Trust tier
- Tier 1
- Headquarters
- CA
- OpenAI-compat
- No
- Open weights
- Yes
- Proprietary
- Yes
When to pick Cohere
Best for
- Full feature coverage — prompt caching, batch tier, function calling, fine-tuning.
- The lowest per-token rate for the maker's own models.
- Production workloads where a direct billing relationship matters.
Avoid for
- Multi-model workflows that need a unified billing surface.
- Anywhere the maker's own SLA isn't sufficient.
Models on Cohere
Pricing + measured speed + self-host alternative, one row per model. Click a column header to sort.
| Model ↕ | Maker ↕ | Access ↕ | $/M in ↕ | $/M out ↕ | Tokens/sec ↕ | TTFT ↕ | Self-host on ↕ | |
|---|---|---|---|---|---|---|---|---|
| Cohere: Command A | Cohere | api direct | $2.5 | $10.0 | — | — | API only | Open → |
| Cohere: Command R7B (12-2024) | Cohere | api direct | $0.0375 | $0.15 | — | — | API only | Open → |
| Cohere: Command R (08-2024) | Cohere | api direct | $0.15 | $0.6 | — | — | API only | Open → |
| Cohere: Command R+ (08-2024) | Cohere | api direct | $2.5 | $10.0 | — | — | API only | Open → |
| Command R+ | Cohere | api direct | $2.5 | $10.0 | — | — | 1× Nvidia H100 · INT4 | Open → |