OPEN-WEIGHT GIANT · QWEN SERIES 2023 · ALIBABA CLOUD · APACHE 2.0 · 119 LANGUAGES

Qwen API Pricing

The Qwen (Tongyi Qianwen) model family from Alibaba Cloud's Tongyi Lab. First released in 2023, Qwen is the most-downloaded open-weight model family in the world — most tiers ship under Apache 2.0 on Hugging Face and ModelScope, while the proprietary Max tier is served through Alibaba Cloud's Model Studio.

Production models: 14 (+ 300+ open weights)
Qwen series: 2023 (Tongyi Lab)
Cheapest tier: $0.05/M (Qwen3 VL Flash input)
Open weights: Apache 2.0 (HF + ModelScope)
Context window: 1M (Qwen 3.5 Flash / Coder)
Languages: 119 (trained on 36T tokens)
§ 01 / LINEUP

The full roster.

FAST · LIGHTWEIGHT

QwQ 32B

Open-weight Qwen reasoning model — 32B params, Apache 2.0.

Input
$0.29/M
Output
$0.86/M
131K ctx · text-only
FRONTIER · REASONING

Qwen3 235B A22B

Open-weight Qwen3 frontier — 235B total params, 22B active per token (MoE), Apache 2.0.

Input
$0.70/M
Output
$2.80/M
131K ctx · text-only
BALANCED · MID-TIER

Qwen3 32B

Open-weight dense 32B Qwen3, Apache 2.0.

Input
$0.16/M
Output
$0.64/M
131K ctx · text-only
FAST · LIGHTWEIGHT

Qwen3 14B

Open-weight dense 14B Qwen3, Apache 2.0.

Input
$0.35/M
Output
$1.40/M
131K ctx · text-only
FRONTIER · REASONING

Qwen 3.5 397B A17B

Open-weight Qwen 3.5 flagship MoE: 397B total parameters, 17B active per token.

Input
$0.60/M
Output
$3.60/M
256K ctx · text-only
BALANCED · MID-TIER

Qwen 3.5 122B A10B

Open-weight Qwen 3.5 mid-tier MoE: 122B total parameters, 10B active per token.

Input
$0.40/M
Output
$3.20/M
256K ctx · text-only
BALANCED · MID-TIER

Qwen 3.5 Plus

Managed Plus tier of the Qwen 3.5 generation on Model Studio. Listed prices are for the International deployment (Singapore).

Input
$0.40/M
Output
$2.40/M
256K ctx · vision
FAST · LIGHTWEIGHT

Qwen 3.5 Flash

Managed Flash tier of the Qwen 3.5 generation on Model Studio. Listed prices are for the International deployment (Singapore).

Input
$0.10/M
Output
$0.40/M
1M ctx · vision
FRONTIER · REASONING

Qwen3 Max

Listed prices are for International deployment (Singapore region) at ≤32K input tokens.

Input
$1.20/M
Output
$6/M
252K ctx · vision
BALANCED · MID-TIER

Qwen3 VL Plus

Vision-language sibling of Qwen 3.5 Plus on Model Studio.

Input
$0.20/M
Output
$1.60/M
256K ctx · vision
FAST · LIGHTWEIGHT

Qwen3 VL Flash

Cheapest tier of the Qwen3 VL family on Model Studio.

Input
$0.05/M
Output
$0.40/M
256K ctx · vision
BALANCED · MID-TIER

Qwen3 Coder Plus

Code-specialised flagship for the Qwen3 line.

Input
$1/M
Output
$5/M
1M ctx · text-only
FAST · LIGHTWEIGHT

Qwen3 Coder Flash

Cost-efficient Qwen3 coder.

Input
$0.30/M
Output
$1.50/M
1M ctx · text-only
BALANCED · MID-TIER

QwQ Plus

Proprietary reasoning model in the QwQ family on Model Studio.

Input
$0.80/M
Output
$2.40/M
131K ctx · text-only
DEPRECATED · RETIRING

Qwen Turbo · LAUNCHED APR 2025

Deprecated text model.

Input
$0.05/M
Output
$0.20/M
1M ctx · text-only
LEGACY · STILL SUPPORTED

Qwen 2.5 72B Instruct · LAUNCHED SEP 2024

Open-weight Qwen 2.5 flagship dense model: 72.7B params, released under the Qwen License rather than Apache 2.0.

Input
$1.40/M
Output
$5.60/M
131K ctx · text-only
LEGACY · STILL SUPPORTED

Qwen 2.5 VL 72B Instruct · LAUNCHED JAN 2025

Vision-language flagship of the Qwen 2.5 era — 72B params, strong on document and chart understanding, agentic UI.

Input
$2.80/M
Output
$8.40/M
131K ctx · vision
LEGACY · STILL SUPPORTED

Qwen 2.5 32B Instruct · LAUNCHED SEP 2024

Open-weight Qwen 2.5 mid-tier — 32B dense, Apache 2.0.

Input
$0.70/M
Output
$2.80/M
131K ctx · text-only
LEGACY · STILL SUPPORTED

Qwen 2.5 Coder 32B Instruct · LAUNCHED NOV 2024

Code-specialised Qwen 2.5 — 32B dense, Apache 2.0.

Input
$0.29/M
Output
$0.86/M
131K ctx · text-only
LEGACY · STILL SUPPORTED

Qwen 2.5 14B Instruct · LAUNCHED SEP 2024

Open-weight Qwen 2.5 14B dense — Apache 2.0.

Input
$0.35/M
Output
$1.40/M
131K ctx · text-only
LEGACY · STILL SUPPORTED

Qwen 2.5 7B Instruct · LAUNCHED SEP 2024

Open-weight Qwen 2.5 7B dense — Apache 2.0.

Input
$0.17/M
Output
$0.70/M
131K ctx · text-only
LEGACY · STILL SUPPORTED

Qwen Max (2.5) · LAUNCHED JAN 2025

Legacy proprietary Qwen 2.5 Max row.

Input
$1.60/M
Output
$6.40/M
32K ctx · text-only
LEGACY · STILL SUPPORTED

Qwen Plus (2.5) · LAUNCHED DEC 2025

Legacy proprietary Qwen Plus tier.

Input
$0.40/M
Output
$1.20/M
1M ctx · text-only
LEGACY · STILL SUPPORTED

Qwen Flash (2.5) · LAUNCHED JUL 2025

Legacy proprietary Qwen Flash tier.

Input
$0.05/M
Output
$0.40/M
1M ctx · text-only
LEGACY · STILL SUPPORTED

Qwen VL Max · LAUNCHED JAN 2024

Original Qwen vision-language flagship.

Input
$0.80/M
Output
$3.20/M
128K ctx · vision
LEGACY · STILL SUPPORTED

Qwen VL Plus · LAUNCHED JAN 2024

Original cost-efficient Qwen vision-language model.

Input
$0.21/M
Output
$0.63/M
128K ctx · vision
§ 02 / SHELF

All side-by-side.

Model                        Input /M  Output /M  Context  Tier
QwQ 32B                      $0.29     $0.86      131K     Light
Qwen3 235B A22B              $0.70     $2.80      131K     Frontier
Qwen3 32B                    $0.16     $0.64      131K     Mid
Qwen3 14B                    $0.35     $1.40      131K     Light
Qwen 3.5 397B A17B           $0.60     $3.60      256K     Frontier
Qwen 3.5 122B A10B           $0.40     $3.20      256K     Mid
Qwen 3.5 Plus                $0.40     $2.40      256K     Mid
Qwen 3.5 Flash               $0.10     $0.40      1M       Light
Qwen3 Max (flagship)         $1.20     $6.00      252K     Frontier
Qwen3 VL Plus                $0.20     $1.60      256K     Mid
Qwen3 VL Flash               $0.05     $0.40      256K     Light
Qwen3 Coder Plus             $1.00     $5.00      1M       Mid
Qwen3 Coder Flash            $0.30     $1.50      1M       Light
QwQ Plus                     $0.80     $2.40      131K     Mid
Qwen Turbo                   $0.05     $0.20      1M       Deprecated
Qwen 2.5 72B Instruct        $1.40     $5.60      131K     Legacy
Qwen 2.5 VL 72B Instruct     $2.80     $8.40      131K     Legacy
Qwen 2.5 32B Instruct        $0.70     $2.80      131K     Legacy
Qwen 2.5 Coder 32B Instruct  $0.29     $0.86      131K     Legacy
Qwen 2.5 14B Instruct        $0.35     $1.40      131K     Legacy
Qwen 2.5 7B Instruct         $0.17     $0.70      131K     Legacy
Qwen Max (2.5)               $1.60     $6.40      32K      Legacy
Qwen Plus (2.5)              $0.40     $1.20      1M       Legacy
Qwen Flash (2.5)             $0.05     $0.40      1M       Legacy
Qwen VL Max                  $0.80     $3.20      128K     Legacy
Qwen VL Plus                 $0.21     $0.63      128K     Legacy
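
The per-million rates in the table convert to a per-request cost by simple proportion. The sketch below is only illustrative: `request_cost` is a hypothetical helper, the three rates are copied from the table, and the workload size is made up. It ignores length-based tiers and any caching or batch discounts, so treat the results as rough estimates.

```python
# Rates in USD per million tokens (input, output), copied from three table rows.
RATES = {
    "Qwen 3.5 Flash": (0.10, 0.40),
    "Qwen 3.5 Plus": (0.40, 2.40),
    "Qwen3 Max": (1.20, 6.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed headline rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 8,000 prompt tokens and 1,000 completion tokens.
    for name in RATES:
        print(f"{name}: ${request_cost(name, 8_000, 1_000):.4f}")
```

At that hypothetical size, one request comes to about $0.0012 on Qwen 3.5 Flash, $0.0056 on Qwen 3.5 Plus, and $0.0156 on Qwen3 Max.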

The Qwen story so far.

Major releases and the open-weight cadence from Alibaba's Tongyi Lab. Sourced from Qwen blog/changelog, Alibaba Cloud Model Studio, and Wikipedia.

APR · 2026 [RELEASE] Qwen3.5-Omni and Qwen3.6-Plus released; newest tiers shipped as proprietary (cloud-only).
FEB 16 · 2026 [RELEASE] Qwen 3.5 family launched: Plus, Flash, and open-weight 397B/122B MoE models.
JAN 23 · 2026 [RELEASE] Qwen3 Max released: proprietary multimodal flagship at $1.20/$6.00, 252K context.
JAN · 2026 [CORPORATE] Qwen passes 200,000 derivative models, the first open model family to hit the milestone.
APR · 2025 [RELEASE] Qwen3 released: 8 sizes, trained on 36T tokens across 119 languages, Apache 2.0 open weights.
MAR 06 · 2025 [RELEASE] QwQ 32B full release: open-weight reasoning model with strong math/code scores for its size.
2023 [CORPORATE] Qwen (Tongyi Qianwen) series launched by Alibaba Cloud's Tongyi Lab; open-sourcing begins.
§ 04 / ACCESS

Where to get it.

§ 05 / BEST FOR

Which Qwen model for what.

If you need the proprietary multilingual flagship: Qwen3 Max
If you want an open-weight frontier MoE to self-host: Qwen 3.5 397B
If you need transparent reasoning: QwQ Plus
If you're building agentic coding tools: Qwen3 Coder Plus
If you process documents, charts, or UI images: Qwen3 VL Plus
If you want cheap high-volume 1M context: Qwen 3.5 Flash
If you need China data residency: Model Studio (China) on Alibaba Cloud
If you want a small open reasoning model to run yourself: QwQ 32B
§ 06 / BACKGROUND

The company behind it.

qwen.ai →

Qwen — branded Tongyi Qianwen (通义千问) — is the large-model family from Alibaba Cloud's Tongyi Lab, which was established in 2022; the Qwen series was first released in 2023 and open-sourcing began that year. It is developed in Hangzhou under Alibaba Group, a public company (NYSE: BABA, HKEX: 9988).

Qwen's defining strategy is open weights at scale. The lab has released 300+ models spanning text, coding, vision, speech, and video, mostly under the permissive Apache 2.0 license on Hugging Face and ModelScope. By March 2026 the family had been downloaded over 940 million times and spawned 200,000+ derivative models — making Qwen the most-downloaded open-weight model family in the world.

Technically, Qwen3 (April 2025) was trained on roughly 36 trillion tokens across 119 languages and dialects, with both dense and mixture-of-experts (MoE) variants from 0.6B up to 235B parameters. The 2026 Qwen 3.5 generation continued the open MoE line (397B-A17B, 122B-A10B) alongside managed Plus/Flash tiers, while the top Qwen3 Max flagship is served as a proprietary, cloud-only model.
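
The "A" suffix in the MoE names gives the parameters active per token, so the active share can be read straight off the name. This is a small sketch of that arithmetic: `active_fraction` is a hypothetical helper, and the counts are the ones quoted on this page.

```python
# (total parameters, active parameters per token), in billions, as listed on this page.
MOE_MODELS = {
    "Qwen3 235B A22B": (235, 22),
    "Qwen 3.5 397B A17B": (397, 17),
    "Qwen 3.5 122B A10B": (122, 10),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_b / total_b


if __name__ == "__main__":
    for name, (total_b, active_b) in MOE_MODELS.items():
        share = active_fraction(total_b, active_b)
        print(f"{name}: {share:.1%} of parameters active per token")
```

The active count is a rough guide to per-token compute. Anyone self-hosting still has to fit the full parameter count in memory, because any expert may be selected.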

Distribution is dual-track: managed inference through Alibaba Cloud Model Studio (with separate International/Singapore and China-domestic endpoints), and self-hosting via open weights. Model Studio pricing is tiered by input length — the headline rates here are the 0–32K range on the Singapore region, and costs scale up for longer prompts.

The trade-offs: most models are text-or-vision with strong multilingual coverage, but the company is China-based, so the direct API stores data per its region (China or Singapore), and US export-control / procurement questions apply. Versus DeepSeek, Qwen offers a far broader lineup and multimodal range; versus OpenAI and Anthropic, it trades some frontier-benchmark lead for radically more open weights and lower cost.

§ 07 / COMPETITORS

Other frontier labs.


Frequently asked.

Practical questions about Qwen pricing, open weights, and deployment.

Q · 01 Which Qwen model should I start with?
For most teams: Qwen 3.5 Plus ($0.40/$2.40) as a multimodal daily driver, or Qwen3 Max ($1.20/$6.00) for the proprietary frontier. Want to self-host? Use the open-weight Qwen 3.5 397B. Drop to Qwen 3.5 Flash ($0.10/$0.40) for cheap 1M-context volume. See the use case picker.
Q · 02 Are Qwen models really open-weight?
Mostly yes. The dense and MoE Qwen3 / Qwen 3.5 models (e.g. 397B-A17B, 32B, QwQ 32B) ship under Apache 2.0 on Hugging Face and ModelScope — fully self-hostable and commercially usable. The top Qwen3 Max tier and some newest releases are proprietary, cloud-only. Always check each model's license.
Q · 03 Why are the prices here lower than what I'm billed?
Model Studio uses tiered pricing by input length. The rates shown are the 0–32K input range on the International (Singapore) region; longer prompts cost more (for example Qwen3 Coder Plus climbs from $1/$5 to $6/$60 at the 256K–1M tier). Batch API offers a 50% discount.
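
The tiering in this answer can be sketched as a lookup on input length. The code rests on several assumptions that the page does not confirm: "32K", "256K" and "1M" are read as 32,000, 256,000 and 1,000,000 tokens; the whole request is billed at the rate of the tier its input falls in; and the batch discount applies to input and output alike. `Tier` and `tiered_cost` are hypothetical names, and only the two Qwen3 Coder Plus tiers quoted above are included.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    min_input: int      # exclusive lower bound on input tokens (assumed)
    max_input: int      # inclusive upper bound on input tokens (assumed)
    input_rate: float   # USD per million input tokens
    output_rate: float  # USD per million output tokens


# Only the two Qwen3 Coder Plus tiers quoted in the answer above. Rates for
# inputs between 32K and 256K are not listed on this page, so they are omitted.
CODER_PLUS_TIERS = [
    Tier(0, 32_000, 1.00, 5.00),
    Tier(256_000, 1_000_000, 6.00, 60.00),
]

BATCH_DISCOUNT = 0.5  # "Batch API offers a 50% discount"


def tiered_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Bill the whole request at the rate of the tier its input length falls in."""
    for tier in CODER_PLUS_TIERS:
        if tier.min_input < input_tokens <= tier.max_input:
            cost = (input_tokens * tier.input_rate
                    + output_tokens * tier.output_rate) / 1_000_000
            return cost * (1 - BATCH_DISCOUNT) if batch else cost
    raise ValueError(f"no rate listed here for {input_tokens} input tokens")
```

Under those assumptions, `tiered_cost(20_000, 2_000)` comes to $0.03. A 500,000-token prompt with the same 2,000-token output comes to $3.12, or $1.56 with `batch=True`. An input in a range this page does not price raises `ValueError`.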
Q · 04 Where is my data stored?
It depends on the endpoint. The International (Singapore) Model Studio region keeps non-China traffic in Singapore; the China-domestic region keeps data in mainland China. For full control — or to avoid both — self-host the Apache-2.0 weights on your own cloud. US export-control and procurement questions apply.
Q · 05 How many languages does Qwen support?
Qwen3 was trained on roughly 36 trillion tokens across 119 languages and dialects — one of the widest multilingual footprints of any major model family, which is a key reason for its global open-weight adoption.
Q · 06 What's the difference between Qwen3 Max and the open Qwen 3.5 models?
Qwen3 Max is the proprietary, cloud-only flagship (multimodal, 252K context, $1.20/$6.00). The Qwen 3.5 open MoE models (397B-A17B, 122B-A10B) are Apache-2.0 and self-hostable, at lower per-token cost. Choose Max for the managed top tier; choose 3.5 open for control and self-hosting.
Q · 07 How does Qwen compare to DeepSeek and Llama?
DeepSeek is cheaper per token but text-only with a narrow lineup; Qwen offers full-modality breadth (vision, coding, speech) and 300+ open models. Versus Meta's Llama, Qwen ships more sizes and modalities and tops global download charts, but Llama carries a Western governance posture some teams prefer.