Last verified 2026-05-19

CODE 32B · LEGACY QWEN 2.5 · 131K CONTEXT · TEXT + CODE · BATCH -50%

Qwen2.5 Coder 32B Instruct API Pricing

Qwen2.5 Coder 32B Instruct is Alibaba's legacy Qwen2.5 code-specialized 32B model, kept here for compatibility and invoice checks. The live vendor table lists $0.287/M input and $0.861/M output. Pricing is pulled from alibabacloud.com daily.

Input - per 1M tokens: $0.29/M (source: Alibaba Cloud, flat)

Output - per 1M tokens: $0.86/M (context: 131K, flat)

Cache - N/A: $0.29/M (no cache-read dollar row listed)

Effective - agentic blend: $0.33/M (92/8 split, no cache)

§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current Qwen2.5 Coder 32B Instruct rates. Adjust spend, output mix, or cache assumptions, then share the URL to pass the calculation along.
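The arithmetic behind the calculator can be sketched in a few lines of Python. This is a minimal sketch, not the page's actual implementation: the function names are hypothetical, the 92/8 default is read as 92% input and 8% output, and the cached-input rate defaults to the $0.287/M input rate because no cache-read price is published.

```python
# Rates from the vendor table quoted on this page (USD per 1M tokens).
INPUT_PER_M = 0.287
OUTPUT_PER_M = 0.861


def effective_rate(input_share=0.92, cache_hit_rate=0.0,
                   cached_input_per_m=INPUT_PER_M):
    """Blended USD per 1M tokens for a given input/output split.

    Assumption: cached input is priced like uncached input unless a
    different cached_input_per_m is passed in.
    """
    output_share = 1.0 - input_share
    # Average input price across cache hits and misses.
    input_rate = (cache_hit_rate * cached_input_per_m
                  + (1.0 - cache_hit_rate) * INPUT_PER_M)
    return input_share * input_rate + output_share * OUTPUT_PER_M


def tokens_for_budget(spend_usd, **kwargs):
    """How many tokens a monthly budget buys at the blended rate."""
    return int(spend_usd / effective_rate(**kwargs) * 1_000_000)


if __name__ == "__main__":
    print(f"effective rate: ${effective_rate():.2f}/M")
    print(f"tokens for $100: {tokens_for_budget(100):,}")
```

Because the cached rate equals the input rate here, changing `cache_hit_rate` does not move the result; it only matters once a real cache-read price is supplied.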

§ 02 / SCENARIOS

Real-world presets.

CODING AGENT

Repo implementation

$0.009/task

22,000 in - 3,000 out · ~11,236 units/$100

CODE REVIEW

Pull request review

$0.006/review

14,000 in - 1,800 out · ~17,857 units/$100

RAG

Knowledge base answer

$0.003/query

9,000 in - 1,000 out · ~29,412 units/$100

CHATBOT

Support assistant

$0.001/turn

2,500 in - 600 out · ~83,333 units/$100
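The preset figures above can be reproduced from the listed rates. The sketch below is illustrative only: the helper names are hypothetical, and rounding the per-unit cost to four decimals before dividing into $100 is an inference from the published numbers, not a documented rule.

```python
# Rates from the vendor table quoted on this page (USD per 1M tokens).
INPUT_PER_M = 0.287
OUTPUT_PER_M = 0.861

# Preset name -> (input tokens, output tokens), as listed above.
SCENARIOS = {
    "Repo implementation": (22_000, 3_000),
    "Pull request review": (14_000, 1_800),
    "Knowledge base answer": (9_000, 1_000),
    "Support assistant": (2_500, 600),
}


def unit_cost(in_tokens, out_tokens):
    """Uncached, real-time cost in USD for one task/review/query/turn."""
    return (in_tokens * INPUT_PER_M + out_tokens * OUTPUT_PER_M) / 1_000_000


def units_per_100(cost):
    """Units that fit in $100.

    Assumption: the cost is rounded to 4 decimals first, which is what
    matches the figures shown on this page.
    """
    return round(100 / round(cost, 4))


if __name__ == "__main__":
    for name, (tin, tout) in SCENARIOS.items():
        cost = unit_cost(tin, tout)
        print(f"{name}: ${cost:.3f} per unit, ~{units_per_100(cost):,} units/$100")
```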

§ 03 / TOKENIZER

Paste text. See tokens. See cost.

Your text · live count

Counts use a chars-per-token calibration measured on the vendor's own published tokenizer (Qwen/Qwen3.5-397B-A17B, 2026-06-10). English prose is typically within a few percent; code and non-Latin scripts tokenize heavier. For billing-exact counts use the vendor's count-tokens API.

Characters 466

Words 65

Tokens (estimated) 89 tokens

Cost as input · uncached $0.000026 USD

Cost as output · uncached $0.000077 USD

Cost as cached input $0.000026 USD
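A rough version of this estimate can be sketched as follows. The chars-per-token ratio is not published on this page; the value below is inferred from the sample above (466 characters, 89 tokens) and is an assumption, as are the function names. For billing-exact numbers, rely on the usage fields returned by the vendor API.

```python
# Assumption: ratio inferred from the sample on this page (466 chars -> 89 tokens).
CHARS_PER_TOKEN = 466 / 89

# Rates from the vendor table quoted on this page (USD per 1M tokens).
INPUT_PER_M = 0.287
OUTPUT_PER_M = 0.861


def estimate_tokens(text):
    """Estimate token count for English prose from character length."""
    if not text:
        return 0
    return max(1, round(len(text) / CHARS_PER_TOKEN))


def cost_usd(tokens, rate_per_m):
    """Cost of a token count at a USD-per-1M-token rate."""
    return tokens * rate_per_m / 1_000_000


if __name__ == "__main__":
    sample = "x" * 466  # stand-in for a 466-character English passage
    n = estimate_tokens(sample)
    print(f"tokens (estimated): {n}")
    print(f"cost as input:  ${cost_usd(n, INPUT_PER_M):.6f}")
    print(f"cost as output: ${cost_usd(n, OUTPUT_PER_M):.6f}")
```

Code and non-Latin scripts usually produce more tokens per character than English prose, so a single ratio like this will undercount them.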

§ 04 / SHELF

Up against the shelf.

Model	Input /M	Output /M	Effective blended	Context	Best for
Qwen2.5 Coder 32B Instruct (current)	$0.29	$0.86	$0.33 (agentic 92/8)	131K	Legacy code-specialized Qwen 2.5
Qwen3 Coder Plus	$1.00	$5.00	$1.32 (pricier)	1M	Current Qwen code flagship
Qwen3 Coder Flash	$0.30	$1.50	$0.40 (pricier)	1M	Cheap current code workloads
Qwen2.5 32B Instruct	$0.70	$2.80	$0.87 (pricier)	131K	Legacy 32B general chat
Qwen3 32B	$0.16	$0.64	$0.20 (cheaper)	131K	Current open 32B reasoning
DeepSeek V4 Flash	$0.14 (cache $0.00)	$0.28	$0.05 (cheaper)	1M	Budget coding and reasoning
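The blended column for the Qwen rows can be checked with a short script. This is a sketch under stated assumptions: the 92/8 split is read as 92% input and 8% output, the function names are hypothetical, and the DeepSeek row is left out because its blended figure depends on a cache hit rate that this page does not state.

```python
# (input $/M, output $/M) for the rows without a separate cache price.
SHELF = {
    "Qwen2.5 Coder 32B Instruct": (0.287, 0.861),
    "Qwen3 Coder Plus": (1.00, 5.00),
    "Qwen3 Coder Flash": (0.30, 1.50),
    "Qwen2.5 32B Instruct": (0.70, 2.80),
    "Qwen3 32B": (0.16, 0.64),
}


def blended(input_per_m, output_per_m, input_share=0.92):
    """92/8 agentic blend in USD per 1M tokens (no cache)."""
    return input_share * input_per_m + (1.0 - input_share) * output_per_m


def compare(current="Qwen2.5 Coder 32B Instruct"):
    """Label every row as pricier or cheaper than the current model."""
    base = blended(*SHELF[current])
    result = {}
    for name, rates in SHELF.items():
        value = blended(*rates)
        if name == current:
            label = "current"
        else:
            label = "pricier" if value > base else "cheaper"
        result[name] = (round(value, 2), label)
    return result


if __name__ == "__main__":
    for name, (value, label) in compare().items():
        print(f"{name}: ${value:.2f}/M ({label})")
```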

Frequently asked.

Practical pricing questions, separated from calculator assumptions and regional tiers.

Q · 01 What is Qwen2.5 Coder 32B Instruct priced at?

Qwen2.5 Coder 32B Instruct is listed at $0.287/M input and $0.861/M output on the live Alibaba Cloud pricing table. All prices on this page are in USD per million tokens.

Q · 02 How is the effective price calculated?

AI//COST uses the same 92/8 agentic blend (92% input, 8% output) everywhere. With no separate cache-read price published, Qwen2.5 Coder 32B Instruct's effective blended cost is 0.92 × $0.287 + 0.08 × $0.861 ≈ $0.33/M.

Q · 03 Is prompt caching priced separately?

No concrete cache-read dollar row is published for this model on the vendor table. The calculator therefore treats cached input as the same $0.287/M baseline instead of inventing a discount.

Q · 04 Are regional prices different?

This page uses Alibaba Cloud Model Studio International/Singapore pricing, where endpoint and data storage are in Singapore and inference resources are scheduled globally excluding Chinese Mainland. The quote tiles use the baseline row named in the source page, not a reseller or proxy price.

Q · 05 Is there a batch discount?

Alibaba documents Batch Invocation as 50% off real-time input and output tokens for supported Qwen rows. The quote tiles show real-time list pricing; batch economics should be treated as a separate calculator variant when supported.
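As a rough illustration of that separate variant, the sketch below applies the documented 50% batch discount to the real-time list rates. It assumes this model is among the supported rows, which should be confirmed against the vendor documentation; the function names are hypothetical.

```python
# Real-time list rates from this page (USD per 1M tokens).
INPUT_PER_M = 0.287
OUTPUT_PER_M = 0.861

# Batch Invocation discount as described above: 50% off input and output.
BATCH_DISCOUNT = 0.50


def batch_rates(input_per_m=INPUT_PER_M, output_per_m=OUTPUT_PER_M,
                discount=BATCH_DISCOUNT):
    """Return (input, output) USD per 1M tokens after the batch discount."""
    factor = 1.0 - discount
    return input_per_m * factor, output_per_m * factor


def job_cost(in_tokens, out_tokens, rates):
    """Cost in USD of a job at the given (input, output) rates."""
    input_rate, output_rate = rates
    return (in_tokens * input_rate + out_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    realtime = (INPUT_PER_M, OUTPUT_PER_M)
    batch = batch_rates()
    print(f"batch rates: ${batch[0]:.4f}/M in, ${batch[1]:.4f}/M out")
    print(f"1M in + 100K out, real-time: ${job_cost(1_000_000, 100_000, realtime):.4f}")
    print(f"1M in + 100K out, batch:     ${job_cost(1_000_000, 100_000, batch):.4f}")
```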

Q · 06 How accurate is the tokenizer estimate?

The browser widget applies a chars-per-token ratio (labeled qwen-tokenizer-estimate) tuned for English text. It is useful for rough planning, but actual billing comes from the vendor API usage fields and can differ for Chinese, code, or mixed-language prompts.

Reviewed by Yaroslav Vikhariev Founder - AI//COST - Pricing pulled daily from alibabacloud.com - Last verified May 19, 2026
