by Zhipu AI
Z.ai: GLM 5.2.
text
closed
1M ctx
Cheapest input: $1.4/M on DeepInfra
Cheapest output: $4.4/M on DeepInfra
Hosted equiv.: ~$1.58/hr @ 100 tok/s on DeepInfra
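The hourly figure appears to follow from the output price and the stated throughput. A minimal sketch of that arithmetic, assuming the rate is derived from output tokens only (the function name `hourly_equivalent` is hypothetical and not part of any provider API):

```python
def hourly_equivalent(price_per_million: float, tokens_per_second: float) -> float:
    """Dollar cost of generating tokens continuously for one hour."""
    tokens_per_hour = tokens_per_second * 3600
    return tokens_per_hour / 1_000_000 * price_per_million


# Assumption: the hourly figure uses the $4.4/M output price at 100 tok/s.
print(f"${hourly_equivalent(4.4, 100):.2f}/hr")  # prints $1.58/hr
```

At 100 tok/s a full hour produces 360,000 tokens, so the result scales linearly with both throughput and the per-million price.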
GLM-5.2 is Z.ai’s flagship model for the era of long-horizon tasks. With a truly usable 1M-token context window, it can handle project-level engineering context, execute long-running tasks more reliably, follow...
Hosted API only
No self-host path — closed weights.
Z.ai: GLM 5.2's weights aren't published. Use it via the access providers below.
Where to use it
Cheapest hosted endpoints.
FAQ
Frequently asked.
How do I run Z.ai: GLM 5.2?
Z.ai: GLM 5.2 is a closed-source API model. The cheapest way to access it is through the API providers listed on this page (direct API, aggregators, and hosted chat UIs).
Where can I access Z.ai: GLM 5.2?
Z.ai: GLM 5.2 is available via DeepInfra, OpenRouter, and z.ai. Each access option lists its own pricing (per million tokens or hourly hosting).
How much does it cost to run Z.ai: GLM 5.2?
API pricing starts at $1.4/M input tokens and $4.4/M output tokens. Self-hosting is not an option because the weights are not published.
Is Z.ai: GLM 5.2 open-source or proprietary?
Z.ai: GLM 5.2 is a proprietary model from Zhipu AI. Access is API-only — there are no public weights to download.
API pricing
Per provider
What it costs per month across providers.
Estimate your monthly bill for Z.ai: GLM 5.2 across every host that publishes per-token pricing. Slide your token volumes; the chart + table re-rank cheapest-first.
Cheapest: $22.8 (OpenRouter)
$/M input: $1.4 per million tokens
$/M output: $4.4 per million tokens
Providers: 2 with priced rows
Monthly bill
Cheapest provider on the left.
Total monthly cost — input + output tokens combined.
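The combined total can be sketched as a per-million price applied to each token volume. The volumes below (10M input, 2M output per month) are hypothetical; they happen to reproduce the $22.8 figure at the listed prices, but the page does not state which volumes it uses by default.

```python
def monthly_bill(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Total monthly cost in dollars: input cost plus output cost."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Hypothetical monthly volumes: 10M input tokens, 2M output tokens.
bill = monthly_bill(10_000_000, 2_000_000, 1.4, 4.4)
print(f"${bill:.2f}")  # prints $22.80
```

Because output tokens cost roughly three times as much as input tokens here, output-heavy workloads will move the total faster than input-heavy ones.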
Bill breakdown.
Context window
How much it can remember.
1M tokens
≈ 786,432 English words
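The word estimate is consistent with reading "1M" as 2^20 = 1,048,576 tokens and applying about 0.75 English words per token. Both are assumptions inferred from the figure, not values the page states, and real tokenizers vary by text. A minimal sketch:

```python
WORDS_PER_TOKEN = 0.75  # assumption: common rule of thumb for English text


def approx_words(context_tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Rough number of English words that fit in a context window."""
    return round(context_tokens * words_per_token)


# Assumption: "1M tokens" means 2**20 = 1,048,576 tokens.
print(f"{approx_words(2**20):,}")  # prints 786,432
```

With a decimal reading of 1M (1,000,000 tokens) the same ratio gives 750,000 words instead.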
Scale markers: 4K, 32K, 128K, 1M
Capabilities
What it can do.
· Vision input
· Audio input
· Video input
· Function calling
· Tool use
· JSON mode
✓ Streaming
· Fine-tuning