Last verified 2026-07-27

OPEN 8B · 128K CONTEXT · HYBRID MODES · INTERNATIONAL PRICE · TEXT + CODE

Qwen3 8B API Pricing

Qwen3 8B is an Alibaba Qwen3 text model, priced here from the Singapore/International row of Model Studio. The verified rates are $0.18/M input and $0.70/M non-thinking output; thinking output is $2.10/M. The Qwen3 launch blog lists Qwen3 8B as an Apache 2.0 dense model with 128K context.

Rate	Price /M	Notes
Input	$0.18	Source: Alibaba Model Studio, flat
Output (non-thinking)	$0.70	Non-thinking baseline, flat
Output (thinking)	$2.10	Thinking chain + answer (reasoning)
Effective, agentic blend	$0.22	92/8 split, no cache
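The effective agentic rate can be reproduced from the two flat rates. The sketch below assumes the 92/8 split means 92% input tokens and 8% non-thinking output tokens by volume; the page does not spell this out, and `blended_rate` is an illustrative name, not part of any Alibaba API.

```python
# Verified International rates from this page, in USD per 1M tokens.
INPUT_PER_M = 0.18
OUTPUT_PER_M = 0.70  # non-thinking output


def blended_rate(input_per_m: float, output_per_m: float, input_share: float = 0.92) -> float:
    """Weighted average price per 1M tokens for a given input/output mix.

    Assumption: input_share is the fraction of all tokens that are input tokens.
    """
    output_share = 1.0 - input_share
    return input_share * input_per_m + output_share * output_per_m


rate = blended_rate(INPUT_PER_M, OUTPUT_PER_M)
print(f"${rate:.2f}/M")  # $0.22/M
```

Under that reading the unrounded value is about $0.2216/M, which matches the $0.22 shown on the page.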

§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with the verified International Qwen3 8B rates. Use it for invoice checks, agent traces, and scenario planning before large Model Studio runs.
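For an offline check, the budget-to-tokens arithmetic can be sketched as follows. This is a minimal approximation, not the calculator's own code: it assumes a 92/8 input/output token mix, non-thinking output, and no cache discount, and `tokens_for_budget` is a hypothetical helper name.

```python
def tokens_for_budget(
    budget_usd: float,
    input_per_m: float = 0.18,   # USD per 1M input tokens (this page)
    output_per_m: float = 0.70,  # USD per 1M non-thinking output tokens (this page)
    input_share: float = 0.92,   # assumed fraction of tokens that are input
) -> float:
    """Approximate total tokens a budget covers at a blended per-token rate."""
    if budget_usd < 0:
        raise ValueError("budget_usd must be non-negative")
    effective_per_m = input_share * input_per_m + (1.0 - input_share) * output_per_m
    return budget_usd / effective_per_m * 1_000_000


tokens = tokens_for_budget(100.0)
print(f"{tokens:,.0f} tokens for $100")
```

With these defaults, $100 covers roughly 451 million tokens. Changing `input_share` shows how quickly an output-heavy workload shrinks that figure.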


§ 02 / SCENARIOS

Real-world presets.

Preset	Task	Cost	Tokens in	Tokens out	Units per $100
CHATBOT	Support answer	$0.0013/turn	4,000	900	~74,074
RAG	Policy analysis	$0.013/query	65,000	2,200	~7,553
CODING	Function rewrite	$0.0057/task	18,000	3,500	~17,575
BATCH	Document summary	$0.0052/doc	24,000	1,200	~19,380
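The preset figures follow from the flat rates and the token counts. A short sketch, assuming non-thinking output and no cache; the `PRESETS` mapping and `request_cost` are illustrative names:

```python
INPUT_PER_M = 0.18   # USD per 1M input tokens
OUTPUT_PER_M = 0.70  # USD per 1M non-thinking output tokens

# (input tokens, output tokens) per unit of work, copied from the presets above.
PRESETS = {
    "chatbot": (4_000, 900),
    "rag": (65_000, 2_200),
    "coding": (18_000, 3_500),
    "batch": (24_000, 1_200),
}


def request_cost(tokens_in: int, tokens_out: int) -> float:
    """Cost in USD of one request at the flat International rates."""
    return (tokens_in * INPUT_PER_M + tokens_out * OUTPUT_PER_M) / 1_000_000


units_per_100 = {}
for name, (tokens_in, tokens_out) in PRESETS.items():
    cost = request_cost(tokens_in, tokens_out)
    units_per_100[name] = round(100 / cost)
    print(f"{name}: ${cost:.5f} per unit, ~{units_per_100[name]:,} units per $100")
```

The units-per-$100 values come from the unrounded per-unit cost, which is why they do not divide evenly into the rounded prices shown.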

§ 03 / TOKENIZER

Paste text. See tokens. See cost.

Your text · live count

Calibrated · measured on the vendor's tokenizer · 2026-06-10

Counts use a chars-per-token calibration measured on the vendor's own published tokenizer (Qwen/Qwen3.5-397B-A17B, 2026-06-10). Estimates for English prose are typically within a few percent of the true count; code and non-Latin scripts use more tokens per character, so their estimates run low. For billing-exact counts, use the vendor's count-tokens API.

Characters 392

Words 58

Tokens (estimated) 75 tokens

Cost as input · uncached $0.00001 USD

Cost as output · uncached $0.00005 USD

Cost as cached input $0.00001 USD
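A chars-per-token estimate of this kind can be sketched as below. The calibration constant is not published on this page, so the ratio here is an assumption derived from the sample above (392 characters, 75 tokens); treat it as illustrative rather than the page's actual value.

```python
# Assumption: ratio inferred from the sample on this page (392 chars -> 75 tokens).
CHARS_PER_TOKEN = 392 / 75

INPUT_PER_M = 0.18   # USD per 1M input tokens
OUTPUT_PER_M = 0.70  # USD per 1M non-thinking output tokens


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Rough token estimate from character length; not billing-exact."""
    return round(len(text) / chars_per_token)


def estimate_cost(tokens: int, rate_per_m: float) -> float:
    """Cost in USD of a token count at a per-1M-token rate."""
    return tokens * rate_per_m / 1_000_000


sample_tokens = estimate_tokens("x" * 392)          # stand-in for a 392-character text
print(sample_tokens)                                 # 75
print(f"${estimate_cost(sample_tokens, INPUT_PER_M):.5f}")   # $0.00001
print(f"${estimate_cost(sample_tokens, OUTPUT_PER_M):.5f}")  # $0.00005
```

Because no cache-read price is listed for this row, the cached-input figure is the same as the uncached input figure.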

§ 04 / SHELF

Up against the shelf.

Model	Input /M	Output /M	Effective blended	Context	Best for
Qwen3 8B (current)	$0.18	$0.70	$0.22 agentic 92/8	128K	Small open Qwen3 deployment
Qwen3 Next 80B A3B Thinking	$0.15	$1.20	$0.23 thinking	262K	Qwen3 sibling
Qwen3 Next 80B A3B Instruct	$0.15	$1.20	$0.23 sibling	262K	Qwen3 sibling
Qwen3 235B A22B Thinking 2507	$0.23	$2.30	$0.40 thinking	262K	Qwen3 sibling
Qwen3 235B A22B Instruct 2507	$0.23	$0.92	$0.29 sibling	262K	Qwen3 sibling
Qwen3 30B A3B Thinking 2507	$0.20	$2.40	$0.38 thinking	262K	Qwen3 sibling
Qwen3 30B A3B Instruct 2507	$0.20	$0.80	$0.25 sibling	262K	Qwen3 sibling
Qwen3 30B A3B	$0.20	$0.80	$0.25 sibling	128K	Qwen3 sibling
Qwen3.7 Plus	$0.40	$1.60	$0.50 sibling	1M	Current Plus tier
Qwen3.7 Max	$2.50	$7.50	$2.90 sibling	1M	Current Max flagship

Frequently asked.

Short answers for teams comparing Qwen3 8B against other current Qwen3 text models.

Q · 01 What is the Qwen3 8B input price?

Alibaba's Singapore/International pricing row lists $0.18/M for input tokens. Alibaba publishes that row in US dollars, so no currency conversion is involved.

Q · 02 What is the Qwen3 8B output price?

This page shows Qwen3 8B at $0.70/M for output tokens. That is the non-thinking rate in Alibaba's International row; the same row prices thinking output at $2.10/M.

Q · 03 Does Alibaba list prompt-cache pricing for this row?

No exact cache-read token price is listed in the verified row, so the calculator keeps cache disabled instead of inventing a discount.

Q · 04 Why does this page use International pricing?

AI//COST uses the Singapore/International row for Alibaba pages because it is the relevant baseline for non-mainland deployment and differs from some US/EU global rows.

Q · 05 Is this a text LLM page only?

Yes. This page covers the text/code API model. Audio, image, and video models are intentionally handled separately.

Q · 06 Where did the context window come from?

The context window comes from Qwen's official model and blog documentation: the Qwen3 launch blog lists Qwen3-8B with 128K context. Pricing comes separately from Alibaba Model Studio.
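To see what the two output rates in Q · 02 mean for a single request, the comparison can be sketched as follows. It assumes that in thinking mode every output token (reasoning chain plus answer) is billed at the thinking rate, which is one reading of the "thinking chain + answer" label; the token counts are hypothetical and `mode_cost` is an illustrative name.

```python
INPUT_PER_M = 0.18            # USD per 1M input tokens
OUTPUT_PER_M = 0.70           # USD per 1M non-thinking output tokens
THINKING_OUTPUT_PER_M = 2.10  # USD per 1M thinking output tokens


def mode_cost(tokens_in: int, tokens_out: int, thinking: bool = False) -> float:
    """Request cost in USD.

    Assumption: in thinking mode, all output tokens are billed at the thinking rate.
    """
    out_rate = THINKING_OUTPUT_PER_M if thinking else OUTPUT_PER_M
    return (tokens_in * INPUT_PER_M + tokens_out * out_rate) / 1_000_000


# Hypothetical request: 18,000 input tokens and 3,500 output tokens in both modes.
plain = mode_cost(18_000, 3_500)
reasoning = mode_cost(18_000, 3_500, thinking=True)
print(f"non-thinking: ${plain:.5f}, thinking: ${reasoning:.5f}")
```

In practice a thinking run usually emits more output tokens than a non-thinking one, so the real gap is likely wider than a same-token-count comparison suggests.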

Reviewed by Yaroslav Vikhariev Founder - AI//COST - Last verified Jun 09, 2026 - Source: help.aliyun.com
