Last verified 2026-07-11

DEPRECATED · 131K CONTEXT · APACHE 2.0 · NO FREE QUOTA

QwQ 32B API Pricing

Deprecated: QwQ 32B was removed from Alibaba Model Studio's International pricing page (verified 2026-07-11) - the QwQ line is superseded by the Qwen3 reasoning models. Still open-weight (Apache 2.0) for self-hosting; its last-listed hosted rates were $0.287/M input and $0.861/M output, retained for reference.

Input - per 1M tokens: $0.29/M (source: Alibaba Cloud, flat)

Output - per 1M tokens: $0.86/M (context: 131K, flat)

Cache: N/A (cached input $0.29/M; cache-read price not listed)

Effective - agentic blend: $0.33/M (92/8 split, 82% cache)

§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with the last-listed QwQ 32B rates. Adjust spend, output mix, or cache assumptions, then share the URL to pass the calculation along.
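The core of the calculation can be sketched offline. The snippet below is a rough illustration, not the calculator's actual code: it takes the $0.33/M effective rate shown on this page as given, and it assumes a words-per-token ratio taken from this page's own tokenizer sample (64 words for 79 tokens). The function names are hypothetical.

```python
# Figures are read off this page; names are illustrative, not the site's code.
EFFECTIVE_USD_PER_M = 0.33   # effective blended rate shown for QwQ 32B
WORDS_PER_TOKEN = 64 / 79    # assumption: ratio from the page's tokenizer sample


def tokens_for_spend(spend_usd: float,
                     usd_per_million: float = EFFECTIVE_USD_PER_M) -> int:
    """Tokens a budget buys at a flat blended $/M rate, rounded down."""
    if usd_per_million <= 0:
        raise ValueError("rate must be positive")
    return int(spend_usd / usd_per_million * 1_000_000)


def words_for_tokens(tokens: int,
                     words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Rough English word equivalent for a token count, rounded down."""
    return int(tokens * words_per_token)


if __name__ == "__main__":
    tokens = tokens_for_spend(100.0)
    print(f"{tokens:,} tokens, roughly {words_for_tokens(tokens):,} English words")
```

Changing `usd_per_million` stands in for tweaking the output mix or cache assumptions, since both only move the blended rate.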


§ 02 / SCENARIOS

Real-world presets.

REASONING

Math solution

$0.006/problem

12,000 in - 2,500 out · ~17,857 units/$100

RAG

Policy analysis

$0.021/query

65,000 in - 2,200 out · ~4,878 units/$100

AGENT

Research branch

$0.030/task

90,000 in - 5,000 out · ~3,322 units/$100

CHATBOT

Expert answer

$0.002/turn

4,000 in - 900 out · ~52,631 units/$100
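The preset figures can be reproduced from the last-listed rates. The sketch below assumes uncached pricing at $0.287/M input and $0.861/M output, and it assumes the units-per-$100 figure is derived from the per-request cost rounded to four decimals and then rounded down; that rounding step is inferred from the numbers above, not documented. Function names are illustrative.

```python
import math

INPUT_USD_PER_M = 0.287   # last-listed QwQ 32B input rate
OUTPUT_USD_PER_M = 0.861  # last-listed QwQ 32B output rate


def request_cost(tokens_in: int, tokens_out: int) -> float:
    """Uncached cost in USD of one request."""
    return (tokens_in * INPUT_USD_PER_M + tokens_out * OUTPUT_USD_PER_M) / 1_000_000


def units_per_budget(cost_usd: float, budget_usd: float = 100.0) -> int:
    """Requests a budget covers.

    Assumption: the cost is rounded to 4 decimals before dividing,
    which is what the preset figures on this page appear to do.
    """
    return math.floor(budget_usd / round(cost_usd, 4))


PRESETS = {
    "Math solution": (12_000, 2_500),
    "Policy analysis": (65_000, 2_200),
    "Research branch": (90_000, 5_000),
    "Expert answer": (4_000, 900),
}

if __name__ == "__main__":
    for name, (tokens_in, tokens_out) in PRESETS.items():
        cost = request_cost(tokens_in, tokens_out)
        print(f"{name}: ${cost:.3f} each, ~{units_per_budget(cost):,} per $100")
```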

§ 03 / TOKENIZER

Paste text. See tokens. See cost.

Your text · live count

Counts use a chars-per-token calibration measured on the vendor's own published tokenizer (Qwen/Qwen3.5-397B-A17B, 2026-06-10). Estimates for English prose are typically within a few percent of the true count; code and non-Latin scripts produce more tokens per character. For billing-exact counts use the vendor's count-tokens API.

Characters 413

Words 64

Tokens (estimated) 79 tokens

Cost as input · uncached $0.000023 USD

Cost as output · uncached $0.000068 USD

Cost as cached input $0.000023 USD
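A chars-per-token estimate of this kind is easy to approximate locally. The sketch below is an illustration only: the ratio is back-derived from the sample above (413 characters, 79 tokens), which is an assumption and not the site's published calibration constant, and the function names are hypothetical.

```python
INPUT_USD_PER_M = 0.287    # last-listed QwQ 32B input rate
OUTPUT_USD_PER_M = 0.861   # last-listed QwQ 32B output rate
CHARS_PER_TOKEN = 413 / 79 # assumption: back-derived from the sample on this page


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Estimate a token count from character length alone."""
    return round(len(text) / chars_per_token)


def token_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD of a token count at a per-million rate."""
    return tokens * usd_per_million / 1_000_000


if __name__ == "__main__":
    sample = "x" * 413  # stand-in for a 413-character English passage
    tokens = estimate_tokens(sample)
    print(f"{tokens} tokens (estimated)")
    print(f"as input:  ${token_cost(tokens, INPUT_USD_PER_M):.6f}")
    print(f"as output: ${token_cost(tokens, OUTPUT_USD_PER_M):.6f}")
```

Because no cache-read price is listed for this row, the cached-input cost equals the uncached input cost in this sketch, which matches the figures above.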

§ 04 / SHELF

Up against the shelf.

Model	Input /M	Output /M	Effective blended	Context	Best for
QwQ 32B (current)	$0.29	$0.86	$0.33 (agentic 92/8)	131K	Open reasoning on a budget
Qwen3 Max	$1.20	$6.00	$1.58 (pricier)	252K	Frontier Qwen proprietary reasoning
Qwen 3.5 Plus	$0.40	$2.40	$0.56 (pricier)	256K	General Qwen production workloads
Qwen 3.5 Flash	$0.10	$0.40	$0.12 (cheaper)	1M	Cheap long-context Qwen traffic
Qwen3 Coder Plus	$1.00	$5.00	$1.32 (pricier)	1M	Agentic coding and code review
Qwen3 Coder Flash	$0.30	$1.50	$0.40 (pricier)	1M	High-volume coding assistants
QwQ Plus	$0.80	$2.40	$0.93 (pricier)	131K	Proprietary reasoning workloads
Qwen3 235B A22B	$0.70	$2.80	$0.87 (pricier)	131K	Open MoE reasoning baseline
GPT-5.4 mini	$0.75 (cache $0.07)	$4.50	$0.54 (pricier)	400K	Coding and computer-use workloads
Gemini 2.5 Flash	$0.30 (cache $0.03)	$2.50	$0.27 (cheaper)	1M	Low-latency multimodal RAG
DeepSeek V4 Flash	$0.14 (cache $0.00)	$0.28	$0.05 (cheaper)	1M	Ultra-cheap API throughput

Frequently asked.

Practical pricing questions, separated from calculator assumptions and regional tiers.

Q · 01 What is QwQ 32B priced at? +

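QwQ 32B was last listed at $0.287/M input and $0.861/M output in Alibaba Cloud Model Studio's International/Singapore deployment section. The row has since been removed from the International pricing page, so these rates are retained for reference.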

Q · 02 Does this page include higher context pricing tiers? +

No. The quote tiles use the baseline tier for the queue. Alibaba publishes higher long-context tiers for some Qwen rows; for example, Qwen3 Coder Plus is priced higher above the 0-32K band, and Qwen3 235B A22B has a separate thinking-mode output price where applicable.

Q · 03 Is prompt caching priced separately? +

No separate cache price is published. QwQ 32B was listed in the QwQ open-source section at $0.287/M input and $0.861/M output, and Alibaba did not list a cache-read price or free quota for this row.

Q · 04 How is the effective price calculated? +

AI//COST uses the same 92/8 agentic blend everywhere. With no exact cache-read dollar price published for this row, QwQ 32B's effective blended cost is $0.33/M.
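The blend can be approximated with a short function. This is a sketch under stated assumptions, not AI//COST's published method: it assumes 92% of tokens are input and 8% output, that 82% of input tokens are cache hits (the figure on the effective-rate tile), and that a row with no listed cache-read price bills cached input at the full input rate. Under those assumptions it reproduces the $0.33/M figure and the blended values shown for other rows in the comparison table.

```python
from typing import Optional

INPUT_SHARE = 0.92     # 92/8 agentic split used on this page
CACHE_HIT_RATE = 0.82  # cache assumption shown on the effective-rate tile


def effective_rate(input_price: float,
                   output_price: float,
                   cache_read_price: Optional[float] = None,
                   input_share: float = INPUT_SHARE,
                   cache_hit_rate: float = CACHE_HIT_RATE) -> float:
    """Blended USD per 1M tokens across cached input, fresh input, and output."""
    # Assumption: with no listed cache-read price, cached input costs the
    # same as fresh input, so the hit rate has no effect on the result.
    cached = input_price if cache_read_price is None else cache_read_price
    blended_input = (1 - cache_hit_rate) * input_price + cache_hit_rate * cached
    return input_share * blended_input + (1 - input_share) * output_price


if __name__ == "__main__":
    print(f"QwQ 32B: ${effective_rate(0.287, 0.861):.2f}/M")
    print(f"Gemini 2.5 Flash: ${effective_rate(0.30, 2.50, 0.03):.2f}/M")
```

For QwQ 32B the calculation reduces to 0.92 × $0.287 + 0.08 × $0.861, which is about $0.333/M.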

Q · 05 Is there a free quota or batch discount? +

Alibaba lists a 90-day activation free quota for many International Qwen rows, but not every open-source row includes one. Batch and context-cache support are model-specific; this page only publishes prices that are explicit in the vendor table.

Q · 06 Are regional prices different? +

Yes. Alibaba Cloud publishes separate International, Global, China (Hong Kong), EU, US, and Chinese Mainland deployment sections. AI//COST uses the International/Singapore baseline for Alibaba queue pages unless the page says otherwise.

Reviewed by Yaroslav Vikhariev Founder - AI//COST - Pricing pulled daily from alibabacloud.com - Last verified July 11, 2026
