
Will my model fit on this GPU?

Compute the VRAM you need for a given AI model at a chosen precision and context window, and see exactly which rentable GPUs have the headroom — plus the cheapest provider for each.

Llama 3.3 70B
Active params: 70.0B (dense)
Bytes / param: 2.0 (FP16)
Context: 128,000 tokens
VRAM required: 179 GB (weights + KV + 20% headroom)
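
The "weights + KV + 20% headroom" breakdown can be sketched roughly as below. This is an illustrative estimate, not the tool's actual formula, which the page does not state. The function name `estimate_vram_gb` is hypothetical, and the layer count, KV-head count, and head dimension are assumptions taken from Llama 3.3 70B's published configuration rather than from this page.

```python
def estimate_vram_gb(
    params: float,           # parameter count, e.g. 70e9
    bytes_per_param: float,  # 2.0 for FP16
    context_tokens: int,     # context window in tokens
    n_layers: int,           # assumption: transformer layer count
    n_kv_heads: int,         # assumption: KV heads (grouped-query attention)
    head_dim: int,           # assumption: dimension per attention head
    kv_bytes: float = 2.0,   # assumption: KV cache stored in FP16
    headroom: float = 0.20,  # the 20% headroom named on the page
) -> float:
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes) for a single sequence."""
    weights = params * bytes_per_param
    # One key vector and one value vector per token, per layer, per KV head.
    kv_per_token = 2 * n_layers * n_kv_heads * head_dim * kv_bytes
    kv_cache = kv_per_token * context_tokens
    return (weights + kv_cache) * (1 + headroom) / 1e9


# Assumed architecture values for Llama 3.3 70B (not given on the page).
estimate = estimate_vram_gb(
    params=70e9,
    bytes_per_param=2.0,
    context_tokens=128_000,
    n_layers=80,
    n_kv_heads=8,
    head_dim=128,
)
print(f"{estimate:.1f} GB")
```

With these assumptions the sketch lands above the 179 GB shown here, so the tool presumably estimates the KV cache differently. Treat either figure as approximate.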

GPUs that fit

Sorted by VRAM ascending — smallest fitting card first (usually the cheapest).

GPU | VRAM | FP16 TFLOPS | TDP | Cheapest provider | $/hr
— | 192 GB | — | 1000 W | DeepInfra | $2.79
— | 192 GB | — | — | TensorWave | $1.95
— | 192 GB | — | 750 W | no live offers | —
— | 192 GB | — | — | no live offers | —
— | 256 GB | — | 1000 W | no live offers | —
— | 288 GB | — | 1400 W | no live offers | —
— | 288 GB | — | 1400 W | io.net | $5.38
— | 288 GB | — | — | no live offers | —
— | 288 GB | — | 2700 W | no live offers | —
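
The filter-and-sort behind a list like this can be sketched as follows: keep the cards whose VRAM meets the requirement, then order them by VRAM ascending. This is a minimal sketch, not the site's implementation. The `Gpu` class, the `gpus_that_fit` function, and the placeholder card names are hypothetical, and the catalog values are only illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gpu:
    name: str                      # placeholder label, not a real product name
    vram_gb: int
    usd_per_hr: Optional[float]    # None means no live offers


def gpus_that_fit(catalog: List[Gpu], required_gb: float) -> List[Gpu]:
    """Return cards with enough VRAM, smallest first."""
    fitting = [g for g in catalog if g.vram_gb >= required_gb]
    # sorted() is stable, so cards with equal VRAM keep their catalog order.
    return sorted(fitting, key=lambda g: g.vram_gb)


# Illustrative catalog with hypothetical names.
catalog = [
    Gpu("card-d", 288, 5.38),
    Gpu("card-a", 80, 1.20),
    Gpu("card-b", 192, 2.79),
    Gpu("card-c", 192, None),
]

for gpu in gpus_that_fit(catalog, required_gb=179):
    price = f"${gpu.usd_per_hr:.2f}/hr" if gpu.usd_per_hr is not None else "no live offers"
    print(f"{gpu.name}: {gpu.vram_gb} GB, {price}")
```

Sorting by VRAM rather than price matches the ordering described above. The smallest fitting card is usually, but not always, the cheapest, so a price-sorted view would need a separate key that handles cards with no live offers.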
