Qwen API Pricing
The Qwen (Tongyi Qianwen) model family comes from Alibaba Cloud's Tongyi Lab. First released in 2023, Qwen is the most-downloaded open-weight model family in the world: most tiers ship under Apache 2.0 on Hugging Face and ModelScope, while the proprietary Max tier is served through Alibaba Cloud's Model Studio.
QwQ 32B
Open-weight Qwen reasoning model — 32B params, Apache 2.0.
Qwen3 235B A22B
Open-weight Qwen3 frontier — 235B total params, 22B active per token (MoE), Apache 2.0.
Qwen3 32B
Open-weight dense 32B Qwen3, Apache 2.0.
Qwen3 14B
Open-weight dense 14B Qwen3, Apache 2.0.
Qwen 3.5 397B A17B
Open-weight Qwen 3.5 flagship MoE: 397B total parameters, 17B active per token.
Qwen 3.5 122B A10B
Open-weight Qwen 3.5 mid-tier MoE: 122B total parameters, 10B active per token.
Qwen 3.5 Plus
International deployment (Singapore).
Qwen 3.5 Flash
International deployment (Singapore).
Qwen3 Max
Listed prices are for International deployment (Singapore region) at ≤32K input tokens.
Qwen3 VL Plus
Vision-language sibling of Qwen 3.5 Plus on Model Studio.
Qwen3 VL Flash
Cheapest tier of the Qwen3 VL family on Model Studio.
Qwen3 Coder Plus
Code-specialised flagship for the Qwen3 line.
Qwen3 Coder Flash
Cost-efficient Qwen3 coder.
QwQ Plus
Proprietary reasoning model in the QwQ family on Model Studio.
Qwen Turbo (launched Apr 2025)
Deprecated text model.
Qwen 2.5 72B Instruct (launched Sep 2024)
Open-weight Qwen 2.5 flagship dense model: 72.7B params, Qwen Research License (non-Apache because of size).
Qwen 2.5 VL 72B Instruct (launched Jan 2025)
Vision-language flagship of the Qwen 2.5 era: 72B params, strong on document and chart understanding, agentic UI.
Qwen 2.5 32B Instruct (launched Sep 2024)
Open-weight Qwen 2.5 mid-tier: 32B dense, Apache 2.0.
Qwen 2.5 Coder 32B Instruct (launched Nov 2024)
Code-specialised Qwen 2.5: 32B dense, Apache 2.0.
Qwen 2.5 14B Instruct (launched Sep 2024)
Open-weight Qwen 2.5 14B dense, Apache 2.0.
Qwen 2.5 7B Instruct (launched Sep 2024)
Open-weight Qwen 2.5 7B dense, Apache 2.0.
Qwen Max (2.5) (launched Jan 2025)
Legacy proprietary Qwen 2.5 Max row.
Qwen Plus (2.5) (launched Dec 2025)
Legacy proprietary Qwen Plus tier.
Qwen Flash (2.5) (launched Jul 2025)
Legacy proprietary Qwen Flash tier.
Qwen VL Max (launched Jan 2024)
Original Qwen vision-language flagship.
Qwen VL Plus (launched Jan 2024)
Original cost-efficient Qwen vision-language model.
| Model | Input / 1M tokens | Output / 1M tokens | Cached | Context | Max output | Vision | Tools | Tier |
|---|---|---|---|---|---|---|---|---|
| QwQ 32B | $0.29 | $0.86 | - | 131K | - | ✗ | ✓ | Light |
| Qwen3 235B A22B | $0.70 | $2.80 | - | 131K | - | ✗ | ✓ | Frontier |
| Qwen3 32B | $0.16 | $0.64 | - | 131K | - | ✗ | ✓ | Mid |
| Qwen3 14B | $0.35 | $1.40 | - | 131K | - | ✗ | ✓ | Light |
| Qwen 3.5 397B A17B | $0.60 | $3.60 | - | 256K | - | ✗ | ✓ | Frontier |
| Qwen 3.5 122B A10B | $0.40 | $3.20 | - | 256K | - | ✗ | ✓ | Mid |
| Qwen 3.5 Plus | $0.40 | $2.40 | - | 256K | - | ✓ | ✓ | Mid |
| Qwen 3.5 Flash | $0.10 | $0.40 | - | 1M | - | ✓ | ✓ | Light |
| Qwen3 Max (flagship) | $1.20 | $6.00 | - | 252K | - | ✓ | ✓ | Frontier |
| Qwen3 VL Plus | $0.20 | $1.60 | - | 256K | - | ✓ | ✓ | Mid |
| Qwen3 VL Flash | $0.05 | $0.40 | - | 256K | - | ✓ | ✓ | Light |
| Qwen3 Coder Plus | $1.00 | $5.00 | - | 1M | - | ✗ | ✓ | Mid |
| Qwen3 Coder Flash | $0.30 | $1.50 | - | 1M | - | ✗ | ✓ | Light |
| QwQ Plus | $0.80 | $2.40 | - | 131K | - | ✗ | ✓ | Mid |
| Qwen Turbo | $0.05 | $0.20 | - | 1M | - | ✗ | ✓ | Deprecated |
| Qwen 2.5 72B Instruct | $1.40 | $5.60 | - | 131K | - | ✗ | ✓ | Legacy |
| Qwen 2.5 VL 72B Instruct | $2.80 | $8.40 | - | 131K | - | ✓ | ✓ | Legacy |
| Qwen 2.5 32B Instruct | $0.70 | $2.80 | - | 131K | - | ✗ | ✓ | Legacy |
| Qwen 2.5 Coder 32B Instruct | $0.29 | $0.86 | - | 131K | - | ✗ | ✓ | Legacy |
| Qwen 2.5 14B Instruct | $0.35 | $1.40 | - | 131K | - | ✗ | ✓ | Legacy |
| Qwen 2.5 7B Instruct | $0.17 | $0.70 | - | 131K | - | ✗ | ✓ | Legacy |
| Qwen Max (2.5) | $1.60 | $6.40 | - | 32K | - | ✗ | ✓ | Legacy |
| Qwen Plus (2.5) | $0.40 | $1.20 | - | 1M | - | ✗ | ✓ | Legacy |
| Qwen Flash (2.5) | $0.05 | $0.40 | - | 1M | - | ✗ | ✓ | Legacy |
| Qwen VL Max | $0.80 | $3.20 | - | 128K | - | ✓ | ✓ | Legacy |
| Qwen VL Plus | $0.21 | $0.63 | - | 128K | - | ✓ | ✓ | Legacy |
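As a rough illustration of how the per-million rates in the table translate into a per-request cost, the sketch below multiplies token counts by the listed input and output rates. The function name `request_cost` and the example token counts are hypothetical; the rates are copied from the table and only apply to the headline (up to 32K input) range.

```python
from decimal import Decimal

# (input $ per 1M tokens, output $ per 1M tokens), copied from the table above.
RATES = {
    "Qwen 3.5 Plus": (Decimal("0.40"), Decimal("2.40")),
    "Qwen 3.5 Flash": (Decimal("0.10"), Decimal("0.40")),
    "Qwen3 Max": (Decimal("1.20"), Decimal("6.00")),
}

MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated USD cost of one request at the listed headline rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / MILLION


# Hypothetical workload: 20,000 input tokens and 1,000 output tokens.
for name in RATES:
    print(f"{name}: ${request_cost(name, 20_000, 1_000):.4f}")
# Qwen 3.5 Plus: $0.0104
# Qwen 3.5 Flash: $0.0024
# Qwen3 Max: $0.0300
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many calls.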
The Qwen story so far.
Major releases and the open-weight cadence from Alibaba's Tongyi Lab. Sourced from Qwen blog/changelog, Alibaba Cloud Model Studio, and Wikipedia.
Model Studio (Bailian / DashScope) — managed API for the full Qwen catalog, including proprietary tiers. The International (Singapore) endpoint serves non-China traffic; pricing is tiered by input length.
OPEN-WEIGHT · SELF-HOST: Most Qwen tiers ship Apache 2.0 open weights on Hugging Face and ModelScope and are fully self-hostable. The most-downloaded open model family, with 300+ models released.
CONSUMER · WEB: chat.qwen.ai is a free consumer assistant covering chat, vision, and coding. It is for end-user usage; there is no API access on this surface.
CHINA · DOMESTIC: The China-region Model Studio endpoint keeps data inside mainland China for domestic compliance. It is the other half of Qwen's dual-deployment model alongside Singapore.
Qwen — branded Tongyi Qianwen (通义千问) — is the large-model family from Alibaba Cloud's Tongyi Lab, which was established in 2022; the Qwen series was first released in 2023 and open-sourcing began that year. It is developed in Hangzhou under Alibaba Group, a public company (NYSE: BABA, HKEX: 9988).
Qwen's defining strategy is open weights at scale. The lab has released 300+ models spanning text, coding, vision, speech, and video, mostly under the permissive Apache 2.0 license on Hugging Face and ModelScope. By March 2026 the family had been downloaded over 940 million times and spawned 200,000+ derivative models — making Qwen the most-downloaded open-weight model family in the world.
Technically, Qwen3 (April 2025) was trained on roughly 36 trillion tokens across 119 languages and dialects, with both dense and mixture-of-experts (MoE) variants from 0.6B up to 235B parameters. The 2026 Qwen 3.5 generation continued the open MoE line (397B-A17B, 122B-A10B) alongside managed Plus/Flash tiers, while the top Qwen3 Max flagship is served as a proprietary, cloud-only model.
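To make the "total versus active" naming of the MoE models concrete, the short sketch below computes what share of each model's parameters is active per token. The parameter counts are read off the model names on this page; the helper name `active_fraction` is hypothetical.

```python
# (total parameters in billions, active parameters per token in billions),
# read from the model names listed on this page.
MOE_MODELS = {
    "Qwen3 235B A22B": (235, 22),
    "Qwen 3.5 397B A17B": (397, 17),
    "Qwen 3.5 122B A10B": (122, 10),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters that take part in computing a single token."""
    return active_b / total_b


for name, (total, active) in MOE_MODELS.items():
    print(f"{name}: {active_fraction(total, active):.1%} active per token")
# Qwen3 235B A22B: 9.4% active per token
# Qwen 3.5 397B A17B: 4.3% active per token
# Qwen 3.5 122B A10B: 8.2% active per token
```

Per-token compute tracks the active count, but a self-hosted deployment still has to hold the full set of weights.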
Distribution is dual-track: managed inference through Alibaba Cloud Model Studio (with separate International/Singapore and China-domestic endpoints), and self-hosting via open weights. Model Studio pricing is tiered by input length — the headline rates here are the 0–32K range on the Singapore region, and costs scale up for longer prompts.
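The input-length tiering can be sketched as a lookup on prompt size. The example below uses the only two Qwen3 Coder Plus tiers quoted in the FAQ on this page ($1/$5 at 0–32K and $6/$60 at 256K–1M) plus the 50% batch discount mentioned there. Several things are assumptions for illustration: the names `CODER_PLUS_TIERS` and `tiered_cost`, treating "K" as 1,000 tokens, and applying the batch discount uniformly to the total. Rates for the range between 32K and 256K are not given here, so the sketch refuses to price them rather than guessing.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# Input-length tiers for Qwen3 Coder Plus as
# (lower bound exclusive, upper bound inclusive, input $/M, output $/M).
# Only the two tiers quoted on this page are listed.
CODER_PLUS_TIERS = [
    (0, 32_000, Decimal("1.00"), Decimal("5.00")),
    (256_000, 1_000_000, Decimal("6.00"), Decimal("60.00")),
]

BATCH_DISCOUNT = Decimal("0.5")  # 50% discount quoted for the Batch API


def tiered_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> Decimal:
    """Estimate USD cost, choosing the rate pair from the input length."""
    for low, high, input_rate, output_rate in CODER_PLUS_TIERS:
        if low < input_tokens <= high:
            cost = (input_tokens * input_rate + output_tokens * output_rate) / MILLION
            # Assumption: the batch discount applies uniformly to the total.
            return cost * BATCH_DISCOUNT if batch else cost
    raise ValueError(f"no rate listed on this page for {input_tokens} input tokens")


print(tiered_cost(10_000, 2_000))              # short prompt, first tier
print(tiered_cost(500_000, 2_000))             # long prompt, 256K-1M tier
print(tiered_cost(10_000, 2_000, batch=True))  # same short prompt via batch
```

The long-prompt call costs far more than its token count alone would suggest, because both the input and the output rate change with the tier.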
The trade-offs: most models handle text or vision and offer strong multilingual coverage, but the company is China-based, so the direct API stores data in the region of the endpoint used (China or Singapore), and US export-control and procurement questions apply. Versus DeepSeek, Qwen offers a far broader lineup and multimodal range; versus OpenAI and Anthropic, it trades some frontier-benchmark lead for far more open weights and lower cost.
DeepSeek
Lowest cost-per-token at frontier reasoning quality with MIT open weights. But text-only and a far narrower lineup than Qwen's full-modality catalog.
PRIMARY RIVAL
OpenAI
Frontier benchmarks + ecosystem (GPT-5, ChatGPT, Azure). But closed weights, US data residency, and multiples higher cost than Qwen's open tiers.
FRONTIER LEADER
Anthropic
Leads coding + reasoning with Claude and zero-retention defaults. But no open weights and no low-cost / multilingual story to match Qwen.
Meta
Llama open weights with a Western governance posture. But fewer model sizes and modalities than Qwen, and no managed low-cost API.
Mistral
EU-native open weights (Apache 2.0) with GDPR-first residency — the European answer to Qwen's openness, but a smaller catalog and no China-domestic option.
Frequently asked.
Practical questions about Qwen pricing, open weights, and deployment.
Q · 01 Which Qwen model should I start with?
Start with Qwen 3.5 Plus ($0.40/$2.40) as a multimodal daily driver, or Qwen3 Max ($1.20/$6.00) for the proprietary frontier. Want to self-host? Use the open-weight Qwen 3.5 397B. Drop to Qwen 3.5 Flash ($0.10/$0.40) for cheap 1M-context volume.
Q · 02 Are Qwen models really open-weight?
Q · 03 Why are the prices here lower than what I'm billed?
The rates listed here cover the 0–32K input range on the International (Singapore) region; longer prompts cost more (for example, Qwen3 Coder Plus climbs from $1/$5 to $6/$60 at the 256K–1M tier). Batch API offers a 50% discount.
Q · 04 Where is my data stored?
Q · 05 How many languages does Qwen support?
Q · 06 What's the difference between Qwen3 Max and the open Qwen 3.5 models?
Qwen3 Max is the proprietary, cloud-only flagship ($1.20/$6.00). The Qwen 3.5 open MoE models (397B-A17B, 122B-A10B) are Apache-2.0 and self-hostable, at lower per-token cost. Choose Max for the managed top tier; choose 3.5 open for control and self-hosting.