by Nvidia
NVIDIA: Nemotron 3 Ultra.
text
closed
1M ctx
Cheapest input
$0.5/M
on OpenRouter
Cheapest output
$2.5/M
on OpenRouter
Hosted equiv.
~$0.9/hr
@ 100 tok/s on OpenRouter
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...
Hosted API only
No self-host path — closed weights.
NVIDIA: Nemotron 3 Ultra's weights aren't published. Use it via the access providers below.
Where to use it
Cheapest hosted endpoints.
FAQ
Frequently asked.
How do I run NVIDIA: Nemotron 3 Ultra?
NVIDIA: Nemotron 3 Ultra is a closed-source API model. The cheapest way to access it is through the API providers listed on this page (direct API, aggregators, and hosted chat UIs).
Where can I access NVIDIA: Nemotron 3 Ultra?
NVIDIA: Nemotron 3 Ultra is available via Together AI, OpenRouter. Each access option lists its own pricing (per million tokens or hourly hosting).
How much does it cost to run NVIDIA: Nemotron 3 Ultra?
API pricing starts at $0.5/M input tokens and $2.5/M output tokens. Self-hosting cost depends on the GPU you rent — see the Run It Yourself tab.
Is NVIDIA: Nemotron 3 Ultra open-source or proprietary?
NVIDIA: Nemotron 3 Ultra is a proprietary model from Nvidia. Access is API-only — there are no public weights to download.
API pricing
Per provider
What it costs per month across providers.
Estimate your monthly bill for NVIDIA: Nemotron 3 Ultra across every host that publishes per-token pricing. Slide your token volumes; the chart + table re-rank cheapest-first.
Cheapest
$10.0
OpenRouter
Most expensive
$13.2
Together AI
Spread
$3.2
max − min
Providers
2
with priced rows
Monthly bill
Cheapest provider on the left.
Total monthly cost — input + output tokens combined.
Loading...
Bill breakdown.
Full calculator
Want to compare token volumes across other models too?
Open the standalone API pricing tool →
Context window
How much it can remember.
1M tokens
≈ 750,000 English words
4K
32K
128K
1M
Capabilities
What it can do.
·
Vision input
·
Audio input
·
Video input
·
Function calling
·
Tool use
·
JSON mode
✓
Streaming
·
Fine-tuning