OPEN 7B · LEGACY QWEN 2.5 · 131K CONTEXT · TEXT · BATCH -50%

Qwen2.5 7B Instruct API Pricing

Qwen2.5 7B Instruct is Alibaba's legacy open-weight 7B Qwen2.5 row, kept for compatibility and invoice checks. Alibaba lists the International/Singapore baseline at $0.175/M input and $0.7/M output; newer workloads should compare Qwen3 14B or Qwen3 32B. Prices are pulled directly from alibabacloud.com daily.

Input - per 1M tokens · $0.175/M · Source: Alibaba Cloud, flat rate
Output - per 1M tokens · $0.70/M · Context: 131K, flat rate
Cache - per 1M tokens · $0.175/M · No cache-read dollar price listed for this row; billed at the input rate
Effective - agentic blend · $0.22/M · 92/8 split, no cache
§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current Qwen2.5 7B Instruct rates. Tweak spend, output mix, or cache assumptions, then share the URL to pass the calculation along.

$ /mo
Workload split
Prompt cache hit rate
Tokens you can process
Words equivalent (English)
Effective rate
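The calculator's internals are not published on this page, so the following is only a minimal sketch of how the inputs above could map to the outputs. It assumes the effective rate is a share-weighted average of the input, cached-input, and output prices; the function names `effective_rate` and `tokens_for_budget` are hypothetical, and only the three rates come from this page.

```python
# Rates from this page (USD per 1M tokens). Everything else is an assumption.
INPUT_RATE = 0.175
OUTPUT_RATE = 0.70
CACHED_RATE = 0.175  # no cache-read price is listed, so cached input bills as input


def effective_rate(output_share: float, cache_hit_rate: float) -> float:
    """Blended USD per 1M tokens for a given output share and cache hit rate."""
    input_share = 1.0 - output_share
    # Input tokens are split between uncached and cached prices.
    input_rate = (1.0 - cache_hit_rate) * INPUT_RATE + cache_hit_rate * CACHED_RATE
    return input_share * input_rate + output_share * OUTPUT_RATE


def tokens_for_budget(monthly_usd: float, output_share: float = 0.08,
                      cache_hit_rate: float = 0.0) -> float:
    """Total tokens (input plus output) that a monthly budget covers."""
    return monthly_usd / effective_rate(output_share, cache_hit_rate) * 1_000_000


if __name__ == "__main__":
    rate = effective_rate(output_share=0.08, cache_hit_rate=0.0)
    print(f"effective rate: ${rate:.3f}/M")
    print(f"tokens for $100/mo: {tokens_for_budget(100.0):,.0f}")
```

Because the cached rate equals the input rate for this row, changing the cache hit rate leaves the result unchanged under these assumptions; it would only matter for a model with a discounted cache-read price.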
§ 02 / SCENARIOS

Real-world presets.

§ 03 / TAPE

Price history.

Legacy row remains listed at $0.175/M input and $0.7/M output on Alibaba's International baseline.

Input · $0.175/M
Output · $0.70/M
Cached · $0.175/M
SEP 19 · Qwen2.5 7B Instruct launch-wave pricing stored at $0.175/M input and $0.7/M output
MAY 19 · Live verification kept $0.175/M input and $0.7/M output
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · qwen-tokenizer-estimate · ≈3.85 chars/token · Auto-counts as you type

This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.

Characters
Words
Tokens (estimated)
Cost as input · uncached
Cost as output · uncached
Cost as cached input
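The estimate described above can be sketched in a few lines. This assumes the ≈3.85 chars/token ratio quoted on this page, rounds the token count up (the page does not say how it rounds), and uses a hypothetical `estimate` helper; it is an approximation, not a tokenizer.

```python
import math

CHARS_PER_TOKEN = 3.85  # approximation quoted on this page, not a real tokenizer
INPUT_RATE = 0.175      # USD per 1M tokens
OUTPUT_RATE = 0.70      # USD per 1M tokens
CACHED_RATE = 0.175     # no cache-read price listed, so treated as the input rate


def estimate(text: str) -> dict:
    """Return character, word, and estimated token counts plus costs in USD."""
    chars = len(text)
    words = len(text.split())
    # Rounding up is an assumption; the page does not state its rounding rule.
    tokens = math.ceil(chars / CHARS_PER_TOKEN)

    def cost(rate_per_million: float) -> float:
        return tokens * rate_per_million / 1_000_000

    return {
        "characters": chars,
        "words": words,
        "tokens_estimated": tokens,
        "cost_as_input": cost(INPUT_RATE),
        "cost_as_output": cost(OUTPUT_RATE),
        "cost_as_cached_input": cost(CACHED_RATE),
    }


if __name__ == "__main__":
    for key, value in estimate("Paste text. See tokens. See cost.").items():
        print(f"{key}: {value}")
```

For exact billing the vendor's official tokenizer should replace the `CHARS_PER_TOKEN` division.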
§ 05 / SHELF

Up against the shelf.

Model | Input /M | Output /M | Effective blended | Context | Best for
Qwen 2.5 7B Instruct (current) | $0.175 | $0.70 | $0.22 (agentic 92/8) | 131K | Legacy small open Qwen deployments
Qwen 2.5 14B Instruct | $0.35 | $1.40 | $0.43 (pricier) | 131K | Legacy compact open Qwen apps
Qwen 2.5 32B Instruct | $0.70 | $2.80 | $0.87 (pricier) | 131K | Legacy 32B open chat workloads
Qwen3 14B | $0.35 | $1.40 | $0.43 (pricier) | 131K | Current compact open Qwen reasoning
Qwen3 32B | $0.16 | $0.64 | $0.20 (cheaper) | 131K | Current open 32B chat and reasoning
Qwen 3.5 Flash | $0.10 | $0.40 | $0.12 (cheaper) | 1M | Cheap long-context Qwen traffic
Gemini 2.5 Flash | $0.30 (cache $0.03) | $2.50 | $0.27 (pricier) | 1M | Google long-context Flash workloads

Frequently asked.

Practical pricing questions, separated from calculator assumptions and regional tiers.

Q · 01 What is Qwen2.5 7B Instruct priced at?
Qwen2.5 7B Instruct is listed at $0.175/M input and $0.7/M output in Alibaba Cloud Model Studio's International/Singapore deployment section. This page records prices in USD per million tokens.
Q · 02 What replaced Qwen2.5 7B Instruct?
Qwen2.5 7B Instruct is a legacy Qwen 2.5 compatibility row. For new workloads, compare Qwen3 14B, Qwen3 32B, or other models in the current Qwen3 family before staying on the older SKU.
Q · 03 Does this page use International or Global pricing?
This page uses Alibaba Cloud Model Studio International deployment pricing, where endpoint and data storage are in Singapore and inference resources are dynamically scheduled globally excluding Chinese Mainland. Global, US, EU, China (Hong Kong), and Chinese Mainland sections can list different prices.
Q · 04 Is prompt caching priced separately?
Alibaba marks context-cache support on some Qwen rows, but this row does not publish a concrete cache-read dollar price in the pricing table. The calculator therefore treats cached input as the same $0.175/M baseline instead of inventing a discount.
Q · 05 How is the effective price calculated?
AI//COST uses the same 92/8 agentic blend everywhere. With no separate cache-read price published for this row, Qwen2.5 7B Instruct's effective blended cost is $0.22/M.
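The arithmetic behind that figure is a weighted average: 0.92 × $0.175 + 0.08 × $0.70 = $0.217, which rounds to $0.22/M. The short sketch below applies the same 92/8 weights to the list prices shown on this page for rows without a separate cache price; the `blended` helper and the `SHELF` dictionary are illustrative names, not part of any published API.

```python
INPUT_SHARE = 0.92   # share of tokens billed as input in the 92/8 blend
OUTPUT_SHARE = 0.08  # share of tokens billed as output


def blended(input_rate: float, output_rate: float) -> float:
    """Effective USD per 1M tokens under the 92/8 split, with no cache discount."""
    return INPUT_SHARE * input_rate + OUTPUT_SHARE * output_rate


# (input, output) list prices in USD per 1M tokens, as shown on this page.
SHELF = {
    "Qwen 2.5 7B Instruct": (0.175, 0.70),
    "Qwen 2.5 14B Instruct": (0.35, 1.40),
    "Qwen 2.5 32B Instruct": (0.70, 2.80),
    "Qwen3 32B": (0.16, 0.64),
    "Qwen 3.5 Flash": (0.10, 0.40),
}

if __name__ == "__main__":
    for name, (input_rate, output_rate) in SHELF.items():
        print(f"{name}: ${blended(input_rate, output_rate):.2f}/M")
```

Gemini 2.5 Flash is left out because it has a separate cache price, and the cache-hit assumption behind its blended figure is not stated here.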
Q · 06 Is there a Batch Invocation discount?
Alibaba documents Batch Invocation as 50% off real-time input and output tokens for supported Qwen rows. The quote tiles show real-time list pricing; batch economics should be treated as a separate calculator variant.
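As a rough illustration of that separate variant, the sketch below compares real-time and batch cost for one job. It assumes this row qualifies for Batch Invocation and that the 50% discount applies equally to input and output tokens; `job_cost` and the token counts in the example are hypothetical.

```python
INPUT_RATE = 0.175     # USD per 1M input tokens, real-time list price
OUTPUT_RATE = 0.70     # USD per 1M output tokens, real-time list price
BATCH_MULTIPLIER = 0.5  # 50% off, assuming the row supports Batch Invocation


def job_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Cost in USD for a job, at real-time or batch pricing."""
    realtime = (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000
    return realtime * BATCH_MULTIPLIER if batch else realtime


if __name__ == "__main__":
    # Hypothetical workload: 50M input tokens and 5M output tokens.
    realtime = job_cost(50_000_000, 5_000_000)
    batched = job_cost(50_000_000, 5_000_000, batch=True)
    print(f"real-time: ${realtime:.2f}")
    print(f"batch:     ${batched:.2f}")
```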
Q · 07 Does Alibaba include a free quota?
Many International Model Studio rows include a 1 million token free quota that is valid for 90 days after activating Model Studio. Free-quota eligibility is deployment- and model-specific, so production estimates should use the paid list prices shown here.