CODE BUDGET · 1M CONTEXT · SINGAPORE BASELINE · LOW COST
Qwen3 Coder Flash API Pricing
Qwen3 Coder Flash is Alibaba's lower-cost Qwen3 code model for high-volume coding assistants. Baseline International/Singapore rates are $0.30/M input and $1.50/M output, pulled directly from alibabacloud.com daily.
Input - per 1M tokens
$0.30/M
Source Alibaba Cloud flat
Output - per 1M tokens
$1.50/M
Context 1M flat
Cache N/A
$0.30/M
Cache not listed
Effective - agentic blend
$0.40/M
92/8 split - 82% cache
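The effective tile can be reproduced with a weighted average. The formula below is an inference from the "92/8 split" and "82% cache" labels, not a definition published on this page, and the symbols are introduced here for illustration only: s_in and s_out are the input and output shares of the workload, h is the cache hit rate, and p_in, p_cache, p_out are the per-million rates.

```latex
\[
E = s_{\text{in}}\,\bigl[h\,p_{\text{cache}} + (1-h)\,p_{\text{in}}\bigr] + s_{\text{out}}\,p_{\text{out}}
\]
% Qwen3 Coder Flash: cached input is treated as the full input rate,
% so the cache hit rate has no effect on the result.
\[
E = 0.92\,\bigl[0.82 \cdot 0.30 + 0.18 \cdot 0.30\bigr] + 0.08 \cdot 1.50
  = 0.276 + 0.120 = 0.396 \approx \$0.40/\text{M}
\]
```

Under this reading, the 82% hit rate only lowers the effective rate for models that list a cache-read price below their input price.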
§ 01 / TERMINAL
Run the numbers.
Live calculator pre-loaded with current Qwen3 Coder Flash rates. Tweak spend, output mix, or cache assumptions, then share the URL to pass the calculation along.
Inputs: monthly spend ($/mo), workload split, prompt cache hit rate.
Outputs: tokens you can process, words equivalent (English), effective rate.
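A minimal sketch of the arithmetic such a calculator might perform, assuming the effective rate is a share-weighted average of cached input, uncached input, and output. The function names `effective_rate` and `tokens_for_budget` are hypothetical, and the defaults (8% output share, 82% cache hit rate) are taken from the "92/8 split - 82% cache" label rather than from any published formula. The words-equivalent output is left out because the page does not state a words-per-token ratio.

```python
def effective_rate(input_rate, output_rate, cached_rate,
                   output_share=0.08, cache_hit=0.82):
    """Blended USD price per 1M tokens (assumed formula, not the site's code)."""
    input_share = 1.0 - output_share
    # Input tokens are split between cache hits and cache misses.
    blended_input = cache_hit * cached_rate + (1.0 - cache_hit) * input_rate
    return input_share * blended_input + output_share * output_rate


def tokens_for_budget(budget_usd, rate_per_million):
    """Number of tokens a monthly budget covers at a given blended rate."""
    return budget_usd / rate_per_million * 1_000_000


# Qwen3 Coder Flash: cached input is billed at the full $0.30/M input rate.
rate = effective_rate(input_rate=0.30, output_rate=1.50, cached_rate=0.30)
tokens = tokens_for_budget(100.0, rate)

print(f"Effective rate: ${rate:.3f}/M")
print(f"Tokens for $100/mo: {tokens:,.0f}")
```

With these rates a hypothetical $100/mo budget covers roughly 250 million tokens. Changing `output_share` shifts the result noticeably, because output costs five times as much as input.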
§ 02 / SCENARIOS
Real-world presets.
CODING AGENT
Repo patch
$0.048/task
CODE REVIEW
Pull request review
$0.019/review
TEST GEN
Unit test drafting
$0.009/file
CHATBOT
Developer assistant
$0.002/turn
§ 03 / TAPE
Price history.
Input · $0.30/M
Output · $1.50/M
Cached · $0.30/M
JUL 28 Launch baseline $0.30/M input and $1.50/M output
MAY 19 Live verification kept $0.30/M input and $1.50/M output
§ 04 / TOKENIZER
Paste text. See tokens. See cost.
Estimate · qwen-tokenizer-estimate · ≈3.85 chars/token · Auto-counts as you type
This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.
Reported fields: characters, words, tokens (estimated), cost as input (uncached), cost as output (uncached), cost as cached input.
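The estimate above can be sketched as follows, assuming a flat 3.85 characters per token and the rates listed on this page. The function name `estimate_cost`, the whitespace-based word count, and rounding to the nearest whole token are illustrative assumptions; this is not Alibaba's tokenizer and will not match billed token counts exactly.

```python
CHARS_PER_TOKEN = 3.85   # approximation quoted on this page
INPUT_RATE = 0.30        # USD per 1M input tokens
OUTPUT_RATE = 1.50       # USD per 1M output tokens
CACHED_RATE = 0.30       # no separate cache-read price is listed


def estimate_cost(text):
    """Rough character-based token and cost estimate (not a real tokenizer)."""
    tokens = round(len(text) / CHARS_PER_TOKEN)  # rounding mode is an assumption
    millions = tokens / 1_000_000
    return {
        "characters": len(text),
        "words": len(text.split()),  # whitespace-separated words
        "tokens_estimated": tokens,
        "cost_as_input": millions * INPUT_RATE,
        "cost_as_output": millions * OUTPUT_RATE,
        "cost_as_cached_input": millions * CACHED_RATE,
    }


sample = "def add(a, b):\n    return a + b\n"
for field, value in estimate_cost(sample).items():
    print(f"{field}: {value}")
```

Because code and non-Latin scripts tokenize less predictably than English prose, treat the result as a planning figure and rely on the vendor's tokenizer for billing-grade counts.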
| Model | Input /M | Output /M | Effective blended /M | Context | Best for |
|---|---|---|---|---|---|
| Qwen3 Coder Flash (current) | $0.30 | $1.50 | $0.40 (agentic 92/8) | 1M | High-volume coding assistants |
| Qwen3 Max | $1.20 | $6.00 | $1.58 (pricier) | 252K | Frontier Qwen proprietary reasoning |
| Qwen 3.5 Plus | $0.40 | $2.40 | $0.56 (pricier) | 256K | General Qwen production workloads |
| Qwen 3.5 Flash | $0.10 | $0.40 | $0.12 (cheaper) | 1M | Cheap long-context Qwen traffic |
| Qwen3 Coder Plus | $1.00 | $5.00 | $1.32 (pricier) | 1M | Agentic coding and code review |
| QwQ Plus | $0.80 | $2.40 | $0.93 (pricier) | 131K | Proprietary reasoning workloads |
| QwQ 32B | $0.29 | $0.86 | $0.33 (cheaper) | 131K | Open reasoning on a budget |
| Qwen3 235B A22B | $0.70 | $2.80 | $0.87 (pricier) | 131K | Open MoE reasoning baseline |
| GPT-5.4 mini | $0.75 (cache $0.07) | $4.50 | $0.54 (pricier) | 400K | Coding and computer-use workloads |
| Gemini 2.5 Flash | $0.30 (cache $0.03) | $2.50 | $0.27 (cheaper) | 1M | Low-latency multimodal RAG |
| DeepSeek V4 Flash | $0.14 (cache $0.00) | $0.28 | $0.05 (cheaper) | 1M | Ultra-cheap API throughput |
Frequently asked.
Practical pricing questions, separated from calculator assumptions and regional tiers.
Q · 01 What is Qwen3 Coder Flash priced at?
Qwen3 Coder Flash is shown at $0.30/M input and $1.50/M output on Alibaba Cloud Model Studio's International/Singapore deployment section.
Q · 02 Does this page include higher context pricing tiers?
The quote tiles use the baseline tier for the queue. Alibaba publishes higher long-context tiers for some Qwen rows; for example Qwen3 Coder Plus rises above the 0-32K band, and Qwen3 235B A22B has a separate thinking-mode output price where applicable.
Q · 03 Is prompt caching priced separately?
Alibaba lists Qwen3 Coder Flash in the same Qwen-Coder pricing table, but the International row does not publish a separate cache-read price. The calculator treats cached input as $0.30/M rather than inventing a discount.
Q · 04 How is the effective price calculated?
AI//COST uses the same 92/8 agentic blend everywhere. With no exact cache-read dollar price published for this row, Qwen3 Coder Flash's effective blended cost is $0.40/M.
Q · 05 Is there a free quota or batch discount?
Alibaba lists a 90-day activation free quota for many International Qwen rows, but not every open-source row includes one. Batch and context-cache support are model-specific; this page only publishes prices that are explicit in the vendor table.
Q · 06 Are regional prices different?
Yes. Alibaba Cloud publishes separate International, Global, China (Hong Kong), EU, US, and Chinese Mainland deployment sections. AI//COST uses the International/Singapore baseline for Alibaba queue pages unless the page says otherwise.