by Alibaba (Qwen Team)

Qwen: Qwen3.5 397B A17B.

multimodal open weights datacenter 397B params 262K ctx
Cheapest input
$0.39/M
on Alibaba DashScope
Cheapest output
$2.34/M
on Alibaba DashScope
Fastest
86 tok/s
on OpenRouter
Smallest GPU
1× AMD MI325

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. It d

Smallest GPU to run it See all quantisations →

1× AMD MI325.

Most-aggressive quantisation we have a working recommendation for. Lower precision = less VRAM = cheaper hardware, at a small accuracy cost.

Where to use it

Cheapest hosted endpoints.

Provider Access $/M in $/M out
Alibaba DashScope api direct $0.39 $2.34 Launch ↗
OpenRouter api aggregator $0.39 $2.34 Launch ↗
DeepInfra hosted inference $0.49 $3.6 Launch ↗
Together AI hosted inference $0.6 $3.6 Launch ↗
Performance

Speed across providers.

Tokens/sec and time-to-first-token measured against the same prompt template on each provider's API.

Provider Tokens/sec TTFT Total
OpenRouter 86.1 5809 ms
FAQ

Frequently asked.

How do I run Qwen: Qwen3.5 397B A17B?
Qwen: Qwen3.5 397B A17B is open-weight, so you can self-host on rented GPUs. See the Run It Yourself tab for GPU configurations + cost estimates, or use one of the hosted inference providers listed on this page.
Where can I access Qwen: Qwen3.5 397B A17B?
Qwen: Qwen3.5 397B A17B is available via Alibaba DashScope, Together AI, DeepInfra, OpenRouter. Each access option lists its own pricing (per million tokens or hourly hosting).
How much does it cost to run Qwen: Qwen3.5 397B A17B?
API pricing starts at $0.39/M input tokens and $2.34/M output tokens. Self-hosting cost depends on the GPU you rent — see the Run It Yourself tab.
Is Qwen: Qwen3.5 397B A17B open-source or proprietary?
Qwen: Qwen3.5 397B A17B is open-weight under the license. You can download and self-host it.