LEGACY QWEN · 1M CONTEXT · TEXT · QWEN 3.5 PLUS REPLACEMENT
Qwen Plus API Pricing
Qwen Plus is the legacy balanced Qwen row for text workloads. Alibaba lists the International/Singapore baseline at $0.40/M input and $1.20/M output; thinking output is listed separately at $4/M. Prices are pulled directly from alibabacloud.com daily.
Input, per 1M tokens: $0.40/M (source: Alibaba Cloud, flat rate)
Output, per 1M tokens: $1.20/M (context: 1M, flat rate)
Cached input: $0.40/M (no separate cache price listed)
Effective, agentic blend: $0.46/M (92/8 split, no cache)
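The effective figure can be reproduced with a weighted average. The sketch below assumes the 92/8 split means 92% input tokens and 8% output tokens by count, which matches the listed $0.46/M; the function name `blended_rate` is hypothetical and not part of any Alibaba or AI//COST API.

```python
def blended_rate(input_per_m: float, output_per_m: float,
                 input_share: float = 0.92) -> float:
    """Weighted average price in USD per 1M tokens.

    Assumption: input_share is the fraction of all tokens that are
    input tokens (0.92 for the 92/8 agentic blend).
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_per_m + (1.0 - input_share) * output_per_m


# Qwen Plus list prices from this page: $0.40/M input, $1.20/M output.
qwen_plus = blended_rate(0.40, 1.20)
print(f"${qwen_plus:.2f}/M")  # prints "$0.46/M"
```

The same call with other no-cache rows (for example $0.40 input and $2.40 output) gives their blended figures, so rows can be compared on one number.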
§ 01 / TERMINAL
Run the numbers.
Live calculator pre-loaded with current Qwen Plus rates. Tweak spend, output mix, or cache assumptions, then share the URL to pass the calculation along.
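For readers without the interactive widget, the calculation can be approximated offline. This is a minimal sketch, not the page's actual calculator: the function name `budget_breakdown`, the way cached and uncached input are averaged by hit rate, and the 0.75 words-per-token ratio for English are all assumptions made for illustration.

```python
def budget_breakdown(
    budget_usd: float,
    input_per_m: float = 0.40,     # Qwen Plus input, USD per 1M tokens
    output_per_m: float = 1.20,    # Qwen Plus output, USD per 1M tokens
    cached_per_m: float = 0.40,    # no cache price listed, so same as input
    input_share: float = 0.92,     # assumed 92% input / 8% output split
    cache_hit_rate: float = 0.0,   # fraction of input tokens served from cache
    words_per_token: float = 0.75, # rough English rule of thumb (assumption)
) -> dict:
    """Estimate how many tokens a monthly budget buys at a blended rate."""
    # Average the cached and uncached input prices by hit rate.
    input_rate = (cache_hit_rate * cached_per_m
                  + (1.0 - cache_hit_rate) * input_per_m)
    # Blend input and output by token share.
    effective_per_m = (input_share * input_rate
                       + (1.0 - input_share) * output_per_m)
    tokens = budget_usd / effective_per_m * 1_000_000
    return {
        "effective_per_m": effective_per_m,
        "tokens": tokens,
        "words": tokens * words_per_token,
    }


result = budget_breakdown(100.0)
print(f"Effective rate: ${result['effective_per_m']:.3f}/M")
print(f"Tokens you can process: {result['tokens']:,.0f}")
print(f"Words equivalent: {result['words']:,.0f}")
```

Because the cached rate equals the input rate for this row, changing `cache_hit_rate` leaves the effective rate unchanged; it only matters for models that publish a cheaper cache-read price.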
§ 02 / SCENARIOS
Real-world presets.
LEGACY CHAT · Support assistant · $0.002/turn
RAG · Knowledge base answer · $0.005/query
LONG CONTEXT · Policy review · $0.050/doc
MIGRATION · A/B comparison · $0.004/test
§ 03 / TAPE
Price history.
Input · $0.40/M
Output · $1.20/M
Cached · $0.40/M
DEC 01 Qwen Plus current snapshot listed at $0.40/M input and $1.20/M output
MAY 19 Live verification kept $0.40/M input and $1.20/M output
§ 04 / TOKENIZER
Paste text. See tokens. See cost.
Estimate · qwen-tokenizer-estimate · ≈3.85 chars/token
This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.
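The approximation described above can be sketched in a few lines. The 3.85 chars/token ratio and the rates come from this page; the function names and the choice to round up to a whole token are assumptions for illustration, and the result carries the same error margin as the page's estimate.

```python
import math

CHARS_PER_TOKEN = 3.85  # ratio quoted on this page; not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token count from character length (rounded up, by assumption)."""
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_cost(text: str, rate_per_m: float) -> float:
    """Estimated cost in USD at a given price per 1M tokens."""
    return estimate_tokens(text) * rate_per_m / 1_000_000


sample = "Qwen Plus is the legacy balanced Qwen row for text workloads."
tokens = estimate_tokens(sample)
print(f"Characters: {len(sample)}")
print(f"Tokens (estimated): {tokens}")
print(f"Cost as input: ${estimate_cost(sample, 0.40):.8f}")
print(f"Cost as output: ${estimate_cost(sample, 1.20):.8f}")
```

For billing-grade numbers, swap `estimate_tokens` for a count from the vendor's official tokenizer and keep the cost arithmetic as is.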
| Model | Input /M | Output /M | Effective blended /M (agentic 92/8) | vs. Qwen Plus | Context | Best for |
|---|---|---|---|---|---|---|
| Qwen Plus (current) | $0.40 | $1.20 | $0.46 | baseline | 1M | Legacy balanced Qwen apps |
| Qwen 3.5 Plus | $0.40 | $2.40 | $0.56 | pricier | 256K | Current Qwen production default |
| Qwen 3.5 Flash | $0.10 | $0.40 | $0.12 | cheaper | 1M | Cheap long-context Qwen traffic |
| Qwen Max (2.5) | $1.60 | $6.40 | $1.98 | pricier | 32K | Legacy Qwen 2.5 compatibility |
| Qwen Flash (2.5) | $0.05 | $0.40 | $0.08 | cheaper | 1M | Legacy cheap long-context apps |
| Qwen Turbo | $0.05 | $0.20 | $0.06 | cheaper | 1M | Deprecated Turbo compatibility |
| GPT-5.4 mini | $0.75 (cache $0.07) | $4.50 | $0.54 | pricier | 400K | OpenAI mini coding and CUA |
| Gemini 2.5 Flash | $0.30 (cache $0.03) | $2.50 | $0.27 | cheaper | 1M | Google long-context Flash workloads |
Frequently asked.
Practical pricing questions, separated from calculator assumptions and regional tiers.
Q · 01 What is Qwen Plus priced at?
Qwen Plus is listed at $0.40/M input and $1.20/M output in Alibaba Cloud Model Studio's International/Singapore deployment section. This page quotes prices in USD per million tokens.
Q · 02 What replaced Qwen Plus?
Qwen 3.5 Plus is the newer Alibaba row to compare first for new projects. Qwen Plus remains listed for compatibility, but new traffic should usually benchmark the replacement before staying on the legacy SKU.
Q · 03 Does this page use International or Global pricing?
This page uses Alibaba Cloud Model Studio International deployment pricing, where endpoint and data storage are in Singapore and inference resources are dynamically scheduled globally excluding Chinese Mainland. Global, US, EU, China (Hong Kong), and Chinese Mainland sections can list different prices.
Q · 04 Is prompt caching priced separately?
Alibaba marks context-cache support on some Qwen rows, but this row does not publish a concrete cache-read dollar price in the pricing table. The calculator therefore treats cached input as the same $0.40/M baseline instead of inventing a discount.
Q · 05 How is the effective price calculated?
AI//COST uses the same 92/8 agentic blend (92% input tokens, 8% output tokens) everywhere. With no separate cache-read price published for this row, Qwen Plus's effective blended cost is $0.46/M.
Q · 06 Is there a Batch Invocation discount?
Alibaba documents Batch Invocation as 50% off real-time input and output tokens for supported Qwen rows. The quote tiles show real-time list pricing; batch economics should be treated as a separate calculator variant.
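As a rough illustration of the batch variant, the sketch below applies the documented 50% discount to the real-time list prices. Whether Qwen Plus itself is among the supported rows is not confirmed here, so treat the output as hypothetical; the function name `batch_rates` and the 92/8 blend applied afterwards are assumptions carried over from the rest of this page.

```python
BATCH_DISCOUNT = 0.50  # 50% off real-time input and output, per Alibaba's docs


def batch_rates(input_per_m: float, output_per_m: float,
                discount: float = BATCH_DISCOUNT) -> tuple[float, float]:
    """Apply a flat discount to both input and output list prices."""
    factor = 1.0 - discount
    return input_per_m * factor, output_per_m * factor


# Real-time Qwen Plus list prices from this page.
batch_in, batch_out = batch_rates(0.40, 1.20)

# Same 92/8 blend as the quote tiles (assumed: 92% input, 8% output tokens).
batch_effective = 0.92 * batch_in + 0.08 * batch_out

print(f"Batch input:  ${batch_in:.2f}/M")
print(f"Batch output: ${batch_out:.2f}/M")
print(f"Batch effective blend: ${batch_effective:.3f}/M")
```

Since the discount is the same on both sides, the blended batch rate is simply half the real-time blended rate.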
Q · 07 Does Alibaba include a free quota?
Many International Model Studio rows include a 1 million token free quota that is valid for 90 days after activating Model Studio. Free-quota eligibility is deployment- and model-specific, so production estimates should use the paid list prices shown here.