SiliconFlow.
China-based aggregator with strong coverage of Chinese open-weight models (GLM, Qwen, DeepSeek, MiniMax).
At a glance
- Service type: API aggregators
- Trust tier: Tier 2
- Headquarters: CN
- OpenAI-compat: No
- Open weights: Yes
- Proprietary: No
When to pick SiliconFlow
Best for
- Building once and swapping models freely — same key, same endpoint shape.
- Workloads that benefit from automatic failover across upstreams.
- Anyone who wants per-token billing without managing N separate accounts.
Avoid for
- Workloads needing the absolute lowest per-token price (first-party usually wins).
- Anything requiring real-time price quotes from the original maker.
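The "same key, same endpoint shape" point under Best for can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the endpoint URL, the API key, and the payload fields are placeholders rather than SiliconFlow's documented API, and the model strings are the display names from this page, not confirmed API identifiers. The requests are built but never sent.

```python
import json
from urllib.request import Request

# Hypothetical placeholders, NOT SiliconFlow's documented endpoint or schema.
ENDPOINT = "https://aggregator.example/v1/chat"
API_KEY = "sk-placeholder"


def build_request(model: str, prompt: str) -> Request:
    """Build (but do not send) a request; only the model name varies."""
    # The payload field names are assumed for illustration.
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Swapping models changes one string; the key and endpoint stay the same.
requests_by_model = {
    name: build_request(name, "Hello") for name in ("GLM-4.5", "Kimi K2.6")
}
```

With an aggregator, the calling code keeps one credential and one URL, so switching models is a change of configuration, not of integration.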
Models on SiliconFlow
Pricing, measured speed, and a self-host alternative, one row per model.
| Model | Maker | Access | $/M in | $/M out | Tokens/sec | TTFT | Self-host on |
|---|---|---|---|---|---|---|---|
| Baichuan2-13B | Baichuan Inc. | hosted inference | — | — | — | — | 1× Nvidia RTX 3080 · INT4 |
| InternLM 2.5 20B | Shanghai AI Lab | hosted inference | — | — | — | — | 1× Nvidia GeForce RTX 4080 · INT4 |
| GLM-4.5 | Zhipu AI | hosted inference | — | — | — | — | 1× AMD MI325 · INT4 |
| Kimi K2.6 | Moonshot AI | hosted inference | — | — | — | — | 4× AMD MI300 · INT4 |
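As a rough guide to reading the price and speed columns once values are filled in, the sketch below turns per-million-token prices into a per-request cost, and TTFT plus Tokens/sec into an approximate response time. The prices and speeds used are hypothetical (the table lists none for these models), TTFT is assumed to be in seconds, and the latency formula is a common simplification, not a measurement method stated on this page.

```python
def request_cost_usd(tokens_in: int, tokens_out: int,
                     usd_per_m_in: float, usd_per_m_out: float) -> float:
    """Cost of one request when prices are quoted per million tokens."""
    return (tokens_in / 1_000_000) * usd_per_m_in \
        + (tokens_out / 1_000_000) * usd_per_m_out


def estimated_latency_s(ttft_s: float, tokens_out: int,
                        tokens_per_sec: float) -> float:
    """Approximate time to a full response: first-token wait plus generation time."""
    return ttft_s + tokens_out / tokens_per_sec


# Hypothetical figures for illustration only; not taken from the table.
cost = request_cost_usd(2_000, 500, usd_per_m_in=0.50, usd_per_m_out=2.00)
latency = estimated_latency_s(ttft_s=0.5, tokens_out=500, tokens_per_sec=100.0)

print(f"cost: ${cost:.4f}, latency: {latency:.1f} s")
```

Under these assumed figures, input and output each contribute $0.001 to the request, and generation time (5 s) dominates the first-token wait (0.5 s).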