CODE 32B · LEGACY QWEN 2.5 · 131K CONTEXT · TEXT + CODE · BATCH -50%

Qwen2.5 Coder 32B Instruct API Pricing

Qwen2.5 Coder 32B Instruct is Alibaba's legacy code-specialized 32B model from the Qwen2.5 family, kept on this page for compatibility and invoice checks. The live vendor table lists $0.287/M input and $0.861/M output. Prices are pulled directly from alibabacloud.com daily.

Input - per 1M tokens
$0.29/M
Source: Alibaba Cloud · flat
Output - per 1M tokens
$0.86/M
Context: 131K · flat
Cache - N/A
$0.29/M
No cache dollar row is listed
Effective - agentic blend
$0.33/M
92/8 split · no cache
§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current Qwen2.5 Coder 32B Instruct rates. Tweak spend, output mix, or cache assumptions, then share the URL to pass the calculation along.

Inputs: monthly spend ($/mo) · workload split · prompt cache hit rate
Outputs: tokens you can process · words equivalent (English) · effective rate
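
The calculator's arithmetic can be approximated offline. The sketch below is not the page's actual implementation: the function names are hypothetical, the split is assumed to mean 92% input tokens and 8% output tokens, and cached input is assumed to bill at the same $0.287/M baseline because no cache row is listed.

```python
# Rates from this page, in USD per 1M tokens.
INPUT_PER_M = 0.287
OUTPUT_PER_M = 0.861
CACHED_PER_M = 0.287  # assumption: no cache row is listed, so cached input bills at the input rate


def effective_rate(input_share: float = 0.92, cache_hit_rate: float = 0.0) -> float:
    """Blended USD per 1M tokens for a given input share and cache hit rate."""
    if not 0.0 <= input_share <= 1.0 or not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("shares and hit rates must be between 0 and 1")
    # Input tokens are a mix of cached and uncached reads.
    input_rate = cache_hit_rate * CACHED_PER_M + (1.0 - cache_hit_rate) * INPUT_PER_M
    return input_share * input_rate + (1.0 - input_share) * OUTPUT_PER_M


def tokens_for_budget(monthly_usd: float, **blend) -> int:
    """Total tokens (input + output) that a monthly budget buys at the blended rate."""
    return int(monthly_usd / effective_rate(**blend) * 1_000_000)


if __name__ == "__main__":
    print(f"effective rate: ${effective_rate():.2f}/M")
    print(f"tokens for $100/mo: {tokens_for_budget(100):,}")
```

Because the cached rate equals the input rate for this model, changing `cache_hit_rate` does not move the result; the parameter only matters for models with a discounted cache row.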
§ 03 / TAPE

Price history.

Qwen2.5 Coder 32B Instruct is listed at $0.287/M input and $0.861/M output on the live Alibaba Cloud pricing table.

Input · $0.29/M
Output · $0.86/M
Cached · $0.29/M
NOV 12 · Launch-wave list price stored at $0.287/M input and $0.861/M output
MAY 19 · Live verification kept $0.287/M input and $0.861/M output
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · qwen-tokenizer-estimate · ≈3.85 chars/token

This is a chars-per-token approximation, not a real tokenizer. Actual token counts vary by language, code density, and tool-call overhead. The estimate is typically within ±10–20% for English prose and further off for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.

Outputs: characters · words · tokens (estimated) · cost as input (uncached) · cost as output (uncached) · cost as cached input
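
The same estimate can be reproduced in a few lines. This is a sketch of the chars-per-token approach described above, not the widget's code: the function names are hypothetical, rounding to the nearest token is an assumption, and the cached-input cost reuses the $0.287/M input rate because no cache row is listed.

```python
CHARS_PER_TOKEN = 3.85  # the page's stated approximation for English text
INPUT_PER_M = 0.287     # USD per 1M input tokens
OUTPUT_PER_M = 0.861    # USD per 1M output tokens


def estimate_tokens(text: str) -> int:
    """Rough token count from character length; not a real tokenizer."""
    if not text:
        return 0
    # Assumption: round to the nearest token, with a floor of one token.
    return max(1, round(len(text) / CHARS_PER_TOKEN))


def estimate_costs(text: str) -> dict:
    """Estimated counts and USD costs for one piece of text."""
    tokens = estimate_tokens(text)
    as_input = tokens * INPUT_PER_M / 1_000_000
    return {
        "characters": len(text),
        "words": len(text.split()),
        "tokens_estimated": tokens,
        "cost_as_input_usd": as_input,
        "cost_as_output_usd": tokens * OUTPUT_PER_M / 1_000_000,
        "cost_as_cached_input_usd": as_input,  # same rate as uncached input
    }


if __name__ == "__main__":
    sample = "def add(a, b):\n    return a + b\n" * 50
    for key, value in estimate_costs(sample).items():
        print(f"{key}: {value}")
```

For code-heavy prompts, treat the result as a lower-confidence figure and compare it against the usage fields returned by the vendor API.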
§ 05 / SHELF

Up against the shelf.

Model | Input /M | Output /M | Effective blended | Context | Best for
Qwen2.5 Coder 32B Instruct (current) | $0.29 | $0.86 | $0.33 (agentic 92/8) | 131K | Legacy code-specialized Qwen 2.5
Qwen3 Coder Plus | $1.00 | $5.00 | $1.32 (pricier) | 1M | Current Qwen code flagship
Qwen3 Coder Flash | $0.30 | $1.50 | $0.40 (pricier) | 1M | Cheap current code workloads
Qwen2.5 32B Instruct | $0.70 | $2.80 | $0.87 (pricier) | 131K | Legacy 32B general chat
Qwen3 32B | $0.16 | $0.64 | $0.20 (cheaper) | 131K | Current open 32B reasoning
DeepSeek V4 Flash | $0.14 (cache $0.00) | $0.28 | $0.05 (cheaper) | 1M | Budget coding and reasoning

Frequently asked.

Practical pricing questions, separated from calculator assumptions and regional tiers.

Q · 01 What is Qwen2.5 Coder 32B Instruct priced at?
Qwen2.5 Coder 32B Instruct is listed at $0.287/M input and $0.861/M output on the live Alibaba Cloud pricing table. All prices on this page are in USD per million tokens.
Q · 02 How is the effective price calculated?
AI//COST applies the same 92/8 agentic blend to every model: 92% of tokens are billed at the input rate and 8% at the output rate. With no separate cache-read price published, Qwen2.5 Coder 32B Instruct's effective blended cost is $0.33/M.
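
As a worked check, assuming the 92/8 blend weights the input rate at 92% and the output rate at 8% (a reading that reproduces the $0.33/M figure), with $P_{\text{in}}$ and $P_{\text{out}}$ as illustrative symbols for the listed rates:

```latex
\begin{align*}
P_{\text{eff}} &= 0.92 \times P_{\text{in}} + 0.08 \times P_{\text{out}} \\
               &= 0.92 \times 0.287 + 0.08 \times 0.861 \\
               &= 0.26404 + 0.06888 \\
               &= 0.33292 \approx \$0.33/\text{M}
\end{align*}
```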
Q · 03 Is prompt caching priced separately?
No concrete cache-read dollar row is published for this model on the vendor table. The calculator therefore treats cached input as the same $0.287/M baseline instead of inventing a discount.
Q · 04 Are regional prices different?
This page uses Alibaba Cloud Model Studio International/Singapore pricing. In that deployment the endpoint and data storage are in Singapore, and inference resources are scheduled globally, excluding the Chinese Mainland. The quote tiles use the baseline row named in the source page, not a reseller or proxy price.
Q · 05 Is there a batch discount?
Alibaba documents Batch Invocation as 50% off real-time input and output tokens for supported Qwen rows. The quote tiles show real-time list pricing; batch economics should be treated as a separate calculator variant when supported.
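
To see what the batch variant would look like, the sketch below applies the documented 50% discount to the real-time rates. It assumes this model is among the supported rows, which the page does not confirm, and the function names are hypothetical.

```python
REALTIME_INPUT_PER_M = 0.287   # USD per 1M input tokens (real-time list price)
REALTIME_OUTPUT_PER_M = 0.861  # USD per 1M output tokens (real-time list price)
BATCH_DISCOUNT = 0.50          # 50% off input and output, per the Batch Invocation note


def batch_rates(discount: float = BATCH_DISCOUNT) -> tuple:
    """Return (input, output) USD per 1M tokens after the batch discount."""
    factor = 1.0 - discount
    return REALTIME_INPUT_PER_M * factor, REALTIME_OUTPUT_PER_M * factor


def job_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """USD cost of one job at real-time or batch rates."""
    if batch:
        in_rate, out_rate = batch_rates()
    else:
        in_rate, out_rate = REALTIME_INPUT_PER_M, REALTIME_OUTPUT_PER_M
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    batch_in, batch_out = batch_rates()
    print(f"batch input:  ${batch_in:.4f}/M")
    print(f"batch output: ${batch_out:.4f}/M")
    print(f"10M in / 1M out, real-time: ${job_cost(10_000_000, 1_000_000):.2f}")
    print(f"10M in / 1M out, batch:     ${job_cost(10_000_000, 1_000_000, batch=True):.2f}")
```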
Q · 06 How accurate is the tokenizer estimate?
The browser widget uses a chars-per-token approximation (qwen-tokenizer-estimate, ≈3.85 chars/token) tuned for English text. It is useful for rough planning, but actual billing comes from the usage fields in the vendor API response and can differ for Chinese, code, or mixed-language prompts.