VISION MODEL · 256K CONTEXT · BATCH -50% · TEXT + VISION

Qwen3 VL Plus API Pricing

Qwen3 VL Plus is the Plus tier of Qwen's vision-language models. Baseline rates are $0.20/M input and $1.60/M output, taken from alibabacloud.com and re-verified against its pricing page.

Input (per 1M tokens) · $0.20/M · source alibabacloud.com, flat
Output (per 1M tokens) · $1.60/M · context 256K, flat
Cache · N/A · $0.20/M (vendor cache row not listed)
Effective (agentic blend) · $0.31/M · 92/8 split, 82% cache
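The page does not publish the formula behind the effective rate. A minimal sketch follows, assuming the 92/8 split means 92% input tokens and 8% output tokens, and that the 82% cache hit rate applies to input tokens only. Under those assumptions it reproduces the $0.31/M figure shown here. The function name and parameters are illustrative and not part of any AI//COST or Alibaba API.

```python
def effective_rate(input_per_m, output_per_m, cache_per_m=None,
                   input_share=0.92, cache_hit=0.82):
    """Blended USD per 1M tokens for a given input/output split and cache hit rate."""
    # Assumption: with no published cache-read price, cached input
    # is billed at the normal input rate.
    if cache_per_m is None:
        cache_per_m = input_per_m
    # Assumption: the cache hit rate applies to input tokens only.
    blended_input = cache_hit * cache_per_m + (1 - cache_hit) * input_per_m
    return input_share * blended_input + (1 - input_share) * output_per_m


# Qwen3 VL Plus: $0.20/M input, $1.60/M output, no separate cache price.
print(round(effective_rate(0.20, 1.60), 2))  # 0.31
```

Because the cached rate equals the input rate for this model, the cache hit rate has no effect on the result; it only matters for models that list a discounted cache-read price.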
§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with Qwen3 VL Plus rates. Tweak spend, output mix, or cache assumptions to compare it with sibling models.

Inputs: monthly spend ($/mo), workload split, prompt cache hit rate
Outputs: tokens you can process, words equivalent (English), effective rate
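The calculator's arithmetic can be approximated offline. The sketch below is an assumption about how such a calculator might work, not the page's actual implementation: it blends the listed rates by output share and cache hit rate, then divides the monthly budget by that blended rate. The `WORDS_PER_TOKEN` value of 0.75 is a common rule of thumb for English and does not come from this page. All names are hypothetical.

```python
INPUT_PER_M = 0.20       # USD per 1M input tokens (listed on this page)
OUTPUT_PER_M = 1.60      # USD per 1M output tokens (listed on this page)
CACHED_PER_M = 0.20      # no separate cache-read price listed, so same as input
WORDS_PER_TOKEN = 0.75   # assumption: rough English rule of thumb


def budget_summary(monthly_usd, output_share=0.08, cache_hit=0.82):
    """Estimate tokens and words a monthly budget buys at the blended rate."""
    # Blend cached and uncached input, then blend input with output.
    input_rate = cache_hit * CACHED_PER_M + (1 - cache_hit) * INPUT_PER_M
    rate = (1 - output_share) * input_rate + output_share * OUTPUT_PER_M
    tokens = monthly_usd / rate * 1_000_000
    return {
        "effective_rate": rate,            # USD per 1M tokens
        "tokens": tokens,                  # total tokens affordable
        "words": tokens * WORDS_PER_TOKEN  # rough English word count
    }


summary = budget_summary(100)
print(f"{summary['effective_rate']:.3f} USD/M, {summary['tokens']:,.0f} tokens")
```

Raising `output_share` pushes the effective rate toward the $1.60/M output price, which is why output-heavy workloads buy far fewer tokens for the same budget.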
§ 02 / SCENARIOS

Real-world presets.

§ 03 / TAPE

Price history.

Listed at $0.2/M input and $1.6/M output.

Input · $0.20/M
Output · $1.60/M
Cached · $0.20/M
DEC 19 · Launch price $0.20/M input and $1.60/M output
MAY 18 · Live verification kept $0.20/M and $1.60/M
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · qwen-tokenizer-estimate · ≈3.85 chars/token · Auto-counts as you type

This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead. Counts are typically off by ±10–20% for English prose, and by more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.

Outputs: characters, words, tokens (estimated), cost as input (uncached), cost as output (uncached), cost as cached input
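The estimate described above can be sketched in a few lines. This is an illustration only: it divides the character count by the page's ≈3.85 chars/token figure and prices the result at the listed rates. Rounding to the nearest whole token is an assumption, as are the function and key names. It is not a real tokenizer and carries the same error margins noted above.

```python
CHARS_PER_TOKEN = 3.85   # approximation quoted on this page
INPUT_PER_M = 0.20       # USD per 1M input tokens
OUTPUT_PER_M = 1.60      # USD per 1M output tokens
CACHED_PER_M = 0.20      # no separate cache-read price listed


def estimate(text):
    """Approximate token count and cost for a piece of text."""
    chars = len(text)
    words = len(text.split())
    # Assumption: round to the nearest whole token.
    tokens = round(chars / CHARS_PER_TOKEN)
    per_token = tokens / 1_000_000
    return {
        "characters": chars,
        "words": words,
        "tokens": tokens,
        "cost_input": per_token * INPUT_PER_M,
        "cost_output": per_token * OUTPUT_PER_M,
        "cost_cached": per_token * CACHED_PER_M,
    }


result = estimate("Describe the chart in this image and list its axis labels.")
print(result["tokens"], f"{result['cost_input']:.8f}")
```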
§ 05 / SHELF

Up against the shelf.

| Model | Input /M | Output /M | Effective blended | Context | Best for |
|---|---|---|---|---|---|
| Qwen3 VL Plus (current) | $0.20 | $1.60 | $0.31 (agentic 92/8) | 256K | Vision and document understanding |
| Qwen3 Max | $1.20 | $6.00 | $1.58 (pricier) | 252K | Frontier Qwen reasoning |
| Qwen 3.5 Plus | $0.40 | $2.40 | $0.56 (pricier) | 256K | General Qwen production workloads |
| Qwen 3.5 Flash | $0.10 | $0.40 | $0.12 (cheaper) | 1M | Bulk chat and long-context RAG |
| Qwen3 VL Flash | $0.05 | $0.40 | $0.08 (cheaper) | 256K | Vision and document understanding |
| GPT-5.4 mini | $0.75 (cache $0.07) | $4.50 | $0.54 (pricier) | 400K | Production workloads |
| Gemini 2.5 Flash | $0.30 (cache $0.03) | $2.50 | $0.27 (cheaper) | 1M | Production workloads |

Frequently asked.

Practical pricing questions, separated from calculator assumptions.

Q · 01 What is Qwen3 VL Plus priced at?
Qwen3 VL Plus is shown at $0.2/M input and $1.6/M output. The page stores USD per-million-token baseline pricing from alibabacloud.com.
Q · 02 Does this page include higher context pricing tiers?
Alibaba publishes tiered pricing for several Qwen models. This page uses the baseline Singapore / International tier. Higher-token tiers are noted on the source page but are not included here; they can be added as a variant later.
Q · 03 Is prompt caching priced separately?
No separate cache-read price is published for this row, so the calculator treats cached input as $0.2/M.
Q · 04 How is the effective price calculated?
AI//COST uses the same 92/8 agentic blend everywhere. For Qwen3 VL Plus, that gives $0.31/M with only documented cache discounts included.
Q · 05 Is there a batch discount?
Alibaba lists Batch Invocation at 50% off for supported Qwen rows.
Q · 06 Are regional prices different?
Yes. Alibaba Cloud publishes separate International, Global, US, EU, China (Hong Kong), and Chinese Mainland deployment sections. AI//COST uses the International / Singapore baseline unless a page explicitly says otherwise.
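For the batch discount described in Q · 05, a rough sketch of the arithmetic is shown below. It assumes the 50% discount applies uniformly to both input and output rates and that this model is among the supported rows; the page's header tag suggests so, but confirm on the vendor's pricing page. The function name and parameters are hypothetical.

```python
BATCH_DISCOUNT = 0.50  # 50% off, as listed for supported Qwen rows


def job_cost(input_tokens, output_tokens,
             input_per_m=0.20, output_per_m=1.60, batch=False):
    """USD cost of a job at the listed rates, optionally with the batch discount."""
    cost = (input_tokens / 1_000_000 * input_per_m
            + output_tokens / 1_000_000 * output_per_m)
    # Assumption: the discount applies equally to input and output tokens.
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return cost


standard = job_cost(10_000_000, 1_000_000)
batched = job_cost(10_000_000, 1_000_000, batch=True)
print(f"standard ${standard:.2f}, batch ${batched:.2f}")
```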