128K CONTEXT · LONG-CONTEXT LEGACY · TEXT ONLY · UNIFIED TOKEN RATE · NO CACHE DISCOUNT

Baichuan3-Turbo (128K) API Pricing

Baichuan3-Turbo (128K) is the longer-context sibling of Baichuan3-Turbo on Baichuan's official pricing page. The live table lists $3.38/M unified, converted from 0.024 yuan per 1K tokens at 7.10 CNY/USD, with the same price billed for input and output. No separate cache-hit discount is published. Rates are pulled directly from platform.baichuan-ai.com daily.
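The conversion behind the $3.38/M figure can be reproduced in a few lines. This is a minimal sketch: the function name is hypothetical, and the only inputs are the two numbers quoted above (0.024 yuan per 1K tokens and 7.10 CNY/USD).

```python
def cny_per_1k_to_usd_per_1m(cny_per_1k: float, cny_per_usd: float) -> float:
    """Convert a price in yuan per 1K tokens to US dollars per 1M tokens."""
    cny_per_1m = cny_per_1k * 1000  # 1M tokens is 1,000 blocks of 1K tokens
    return cny_per_1m / cny_per_usd


usd_per_1m = cny_per_1k_to_usd_per_1m(0.024, 7.10)
print(f"${usd_per_1m:.2f}/M")  # prints $3.38/M
```

Because the exchange rate is a fixed input here, the USD figure moves if a different CNY/USD rate is used, even when the yuan price stays the same.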

Input - per 1M tokens: $3.38/M (source: Baichuan; 2x the 32K row)
Output - per 1M tokens: $3.38/M (unified long-context row; same as input)
Cached input - no separate discount: $3.38/M (cache not listed; 0% off)
Effective - agentic blend: $3.38/M (92/8 split; no cache discount)
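The agentic blend can be sketched as a weighted average. This assumes "92/8" means 92% of tokens are billed as input and 8% as output, which is an interpretation rather than something the page spells out; the function name is hypothetical. Under that reading, a unified rate blends to itself, and the Baichuan-M3 rates from the shelf table ($1.41 input, $4.22 output) blend to the $1.63 shown there.

```python
def blended_rate(input_rate: float, output_rate: float,
                 input_share: float = 0.92) -> float:
    """Weighted average of input and output prices (USD per 1M tokens).

    Assumption: input_share is the fraction of tokens billed as input;
    the remainder is billed as output. Cache discounts are ignored.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_rate + (1.0 - input_share) * output_rate


# Unified pricing: the split does not change the result.
print(round(blended_rate(3.38, 3.38), 2))  # 3.38

# Asymmetric pricing (Baichuan-M3 row): the split matters.
print(round(blended_rate(1.41, 4.22), 2))  # 1.63
```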
§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current Baichuan3-Turbo 128K rates. Adjust spend or token volume, then share the URL to pass the estimate along.
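The core of such a calculator is a single division. The sketch below is not the page's actual implementation: the function names are hypothetical, and the 0.75 words-per-token factor is a common rule of thumb for English, assumed here rather than taken from Baichuan.

```python
USD_PER_1M_TOKENS = 3.38  # unified Baichuan3-Turbo 128K rate from this page
WORDS_PER_TOKEN = 0.75    # assumption: rough English rule of thumb


def tokens_for_budget(monthly_usd: float,
                      rate_per_1m: float = USD_PER_1M_TOKENS) -> int:
    """Number of tokens a monthly budget buys at a flat per-1M rate."""
    if rate_per_1m <= 0:
        raise ValueError("rate_per_1m must be positive")
    return int(monthly_usd / rate_per_1m * 1_000_000)


def words_equivalent(tokens: int,
                     words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Approximate English word count for a token volume."""
    return int(tokens * words_per_token)


tokens = tokens_for_budget(100.0)
print(f"{tokens:,} tokens, roughly {words_equivalent(tokens):,} English words")
```

Because input, output, and cached input all bill at the same rate for this model, the workload split and cache hit rate do not change the result; they would only matter for a model with asymmetric pricing.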

§ 02 / SCENARIOS

Real-world presets.

§ 03 / TAPE

Price history.

Baichuan3-Turbo (128K) has held at $3.38/M unified across our verified live snapshots.

Input · $3.38/M
Output · $3.38/M
Cached · $3.38/M
MAY 18 First AI//COST verified snapshot stored the $3.38/M unified rate
MAY 23 Live verification kept the same $3.38/M unified rate
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · baichuan-tokenizer-estimate · ≈3.85 chars/token

This is a chars-per-token approximation, not a real tokenizer. Actual token counts vary by language, code density, and tool-call overhead. Estimates are typically off by 10–20% for English prose, and by more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.
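A chars-per-token estimate of this kind can be sketched as follows. The 3.85 ratio and the $3.38/M rate come from this page; the function names and the choice to show a flat 20% error band are assumptions made for illustration, not the widget's actual logic.

```python
CHARS_PER_TOKEN = 3.85    # ratio quoted for baichuan-tokenizer-estimate
USD_PER_1M_TOKENS = 3.38  # unified rate: input, output, and cached input


def estimate_tokens(text: str) -> int:
    """Rough token count from character length; not a real tokenizer."""
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_cost_usd(tokens: int,
                      rate_per_1m: float = USD_PER_1M_TOKENS) -> float:
    """Cost of a token count at a flat per-1M rate."""
    return tokens / 1_000_000 * rate_per_1m


def error_band(tokens: int, tolerance: float = 0.20) -> tuple[int, int]:
    """Low and high token counts for an assumed relative error."""
    return round(tokens * (1 - tolerance)), round(tokens * (1 + tolerance))


sample = "Long-context models bill every token in the window. " * 100
tokens = estimate_tokens(sample)
low, high = error_band(tokens)
print(f"~{tokens} tokens (between {low} and {high}), "
      f"about ${estimate_cost_usd(tokens):.6f} as input")
```

For Chinese text or code, the true ratio is likely to differ from 3.85, so the band here should be read as a floor on the uncertainty, not a guarantee.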

§ 05 / SHELF

Up against the shelf.

Model | Input /M | Cached input /M | Output /M | Effective blended /M | vs. current | Context | Best for
--- | --- | --- | --- | --- | --- | --- | ---
Baichuan3-Turbo (128K) | $3.38 | $3.38 | $3.38 | $3.38 | current (agentic 92/8) | 128K | Long-context legacy workloads
Baichuan3-Turbo | $1.69 | $1.69 | $1.69 | $1.69 | cheaper | 32K | Balanced legacy production traffic
Baichuan4 Turbo | $2.11 | $2.11 | $2.11 | $2.11 | cheaper | 32K | Balanced Baichuan production traffic
Baichuan-M3 | $1.41 | $1.41 | $4.22 | $1.63 | cheaper | 32K | Higher-depth medical reasoning
Baichuan4 | $14.09 | $14.09 | $14.09 | $14.09 | pricier | 32K | Premium Baichuan 4-series quality
Gemini 2.5 Flash | $0.30 | $0.03 | $2.50 | $0.27 | cheaper | 1M | Global multimodal budget workloads
DeepSeek V4 Pro | $0.43 | $0.00 | $0.87 | $0.14 | cheaper | 1M | Frontier discount tier

Frequently asked.

Practical pricing questions for Baichuan3-Turbo 128K, especially around the cost of buying more context.

Q · 01 What is Baichuan3-Turbo 128K priced at?
Baichuan's official pricing page lists Baichuan3-Turbo 128K at about $3.38/M on a unified basis. That USD figure comes from 0.024 yuan per 1K tokens converted at 7.10 CNY/USD.
Q · 02 Why does the 128K row cost double the 32K row?
Baichuan's public table prices the 128K row at exactly twice the 32K Baichuan3-Turbo rate: $3.38/M versus $1.69/M. You are paying for the larger context window, not for a different input/output split.
Q · 03 Does Baichuan3-Turbo 128K have prompt-cache pricing?
No separate cache-hit discount is listed for Baichuan3-Turbo 128K on the official pricing page. AI//COST therefore sets cached input equal to the normal token rate instead of inventing another billing mode.
Q · 04 Is it cheaper than Baichuan4?
Yes, by a wide margin. Baichuan4 is about $14.09/M unified on the same pricing page, while Baichuan3-Turbo 128K is about $3.38/M unified, roughly a quarter of the price.
Q · 05 When does the 128K variant make sense?
Only when you genuinely need the longer context. If your workload fits in 32K, the standard Baichuan3-Turbo row cuts the unified token rate in half.
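That rule of thumb can be written as a small selection helper. This is a sketch under stated assumptions: the exact token limits behind "32K" and "128K" are not given on this page, so 32 × 1024 and 128 × 1024 are assumed, as is the idea that prompt and output tokens share one window. The names `ROWS` and `pick_row` are hypothetical.

```python
# (model name, assumed context window in tokens, unified USD per 1M tokens)
ROWS = [
    ("Baichuan3-Turbo", 32 * 1024, 1.69),
    ("Baichuan3-Turbo (128K)", 128 * 1024, 3.38),
]


def pick_row(prompt_tokens: int, max_output_tokens: int = 0):
    """Return the cheapest row whose assumed window fits the request."""
    needed = prompt_tokens + max_output_tokens
    fitting = [row for row in ROWS if row[1] >= needed]
    if not fitting:
        raise ValueError(f"{needed} tokens exceeds every listed window")
    return min(fitting, key=lambda row: row[2])


name, window, rate = pick_row(prompt_tokens=20_000, max_output_tokens=2_000)
print(name, rate)  # Baichuan3-Turbo 1.69

name, window, rate = pick_row(prompt_tokens=90_000, max_output_tokens=2_000)
print(name, rate)  # Baichuan3-Turbo (128K) 3.38
```

Routing each request this way, rather than sending everything to the 128K row, keeps short prompts on the half-price rate.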
Q · 06 How accurate is the tokenizer estimate?
The browser widget uses a baichuan-tokenizer-estimate chars-per-token approximation for English text. Real billing comes from Baichuan's API usage counters and can differ for Chinese, code, or mixed-language prompts.