BUDGET MOE · 32K CONTEXT · TEXT ONLY · UNIFIED TOKEN RATE · NO CACHE DISCOUNT

Baichuan4 Air API Pricing

Baichuan4 Air is Baichuan's lowest-cost 4-series API tier. The official pricing page lists 0.00098 yuan per 1K tokens, which converts to $0.138/M at 7.10 CNY/USD, and the same unified rate is billed for input and output. No separate cache-hit discount is published. Rates are pulled directly from platform.baichuan-ai.com daily.
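The conversion above can be reproduced in a few lines. This is a minimal sketch, assuming the 7.10 CNY/USD rate quoted on this page; the function name is hypothetical and is not part of any Baichuan SDK.

```python
from decimal import Decimal

def cny_per_1k_to_usd_per_1m(cny_per_1k: Decimal, cny_per_usd: Decimal) -> Decimal:
    """Convert a CNY price per 1K tokens into a USD price per 1M tokens."""
    cny_per_1m = cny_per_1k * 1000  # 1M tokens = 1,000 blocks of 1K tokens
    return cny_per_1m / cny_per_usd

# List price and exchange rate as quoted on this page.
usd_per_1m = cny_per_1k_to_usd_per_1m(Decimal("0.00098"), Decimal("7.10"))

print(f"${usd_per_1m:.3f}/M")  # $0.138/M
print(f"${usd_per_1m:.2f}/M")  # $0.14/M (the rounded figure shown on the cards)
```

The dollar figure moves with the exchange rate, so a different CNY/USD assumption gives a different headline price even when the yuan list price is unchanged.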

Input · per 1M tokens · $0.14/M · Source: Baichuan · flat
Output · per 1M tokens · $0.14/M · Unified, same as input · flat
Cached input · no separate discount · $0.14/M · Cache not listed · 0%
Effective · agentic blend · $0.14/M · 92/8 split, no cache discount
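The effective rate is a weighted average of the input and output prices. The sketch below assumes the 92/8 split means 92% input tokens and 8% output tokens; the `cache_hit_rate` parameter and the function name are illustrative assumptions, since this page does not state how a cache hit rate is applied to models that do publish a cache discount.

```python
def blended_rate(input_rate, output_rate, cached_rate=None,
                 input_share=0.92, cache_hit_rate=0.0):
    """Weighted-average USD price per 1M tokens for a mixed workload."""
    if cached_rate is None:
        # No published cache discount: cached input costs the same as input.
        cached_rate = input_rate
    output_share = 1 - input_share
    # Average input price after accounting for the share served from cache.
    effective_input = (cache_hit_rate * cached_rate
                       + (1 - cache_hit_rate) * input_rate)
    return input_share * effective_input + output_share * output_rate

# Baichuan4 Air: a unified rate means the blend equals the list price.
print(f"{blended_rate(0.138, 0.138):.3f}")  # 0.138

# Baichuan-M3-Plus from the comparison table ($0.70 in, $1.27 out).
print(f"{blended_rate(0.70, 1.27):.2f}")  # 0.75
```

With a unified rate and no cache discount, neither the split nor the cache hit rate changes the result, which is why every Baichuan4 Air card shows the same figure.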
§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current Baichuan4 Air rates. Adjust spend or token volume, then share the URL to pass along the estimate.

Inputs: monthly spend ($/mo) · workload split · prompt cache hit rate
Outputs: tokens you can process · words equivalent (English) · effective rate
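A rough offline version of the budget calculation might look like the sketch below. It uses the $0.138/M unified rate from this page; the 0.75 words-per-token ratio is a common rule of thumb for English and an assumption here, not a figure published by Baichuan or by this page.

```python
USD_PER_MILLION = 0.138   # unified Baichuan4 Air rate from this page
WORDS_PER_TOKEN = 0.75    # assumed rule of thumb for English text

def tokens_for_budget(monthly_usd, usd_per_million=USD_PER_MILLION):
    """Number of tokens a monthly budget buys at a flat per-million rate."""
    return int(monthly_usd / usd_per_million * 1_000_000)

def words_equivalent(tokens, words_per_token=WORDS_PER_TOKEN):
    """Approximate English word count for a given token count."""
    return int(tokens * words_per_token)

budget = 100  # USD per month
tokens = tokens_for_budget(budget)
print(f"${budget}/mo buys about {tokens:,} tokens")
print(f"roughly {words_equivalent(tokens):,} English words")
```

Because input, output, and cached input share one rate, the workload split and cache hit rate do not affect the token count for this model.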
§ 02 / SCENARIOS

Real-world presets.

§ 03 / TAPE

Price history.

Baichuan4 Air has held at $0.138/M unified across our verified live snapshots.

Input · $0.14/M
Output · $0.14/M
Cached · $0.14/M
MAY 18 · First AI//COST verified snapshot stored the $0.138/M unified rate
MAY 23 · Live verification kept the same $0.138/M unified rate
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · baichuan-tokenizer-estimate · ≈3.85 chars/token · Auto-counts as you type

This is a chars-per-token approximation, not a real tokenizer. Actual token counts vary by language, code density, and tool-call overhead; estimates are typically off by 10–20% for English prose and by more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.
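The same approximation is easy to reproduce outside the browser. This is a sketch assuming the ≈3.85 chars/token ratio and the $0.138/M rate shown on this page; rounding up with `math.ceil` is an assumption, and the widget may round differently.

```python
import math

CHARS_PER_TOKEN = 3.85    # approximation quoted on this page
USD_PER_MILLION = 0.138   # unified Baichuan4 Air rate from this page

def estimate_tokens(text):
    """Estimate token count from character length (not a real tokenizer)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def estimate_cost_usd(text):
    """Estimated cost in USD; input and output share the same rate."""
    return estimate_tokens(text) * USD_PER_MILLION / 1_000_000

sample = "Summarize the attached contract in three bullet points."
tokens = estimate_tokens(sample)
print(f"{len(sample)} chars ≈ {tokens} tokens ≈ ${estimate_cost_usd(sample):.8f}")
```

For Chinese, code, or mixed-language prompts the ratio can differ substantially, so the result is a planning figure rather than a billing figure.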

Characters
Words
Tokens (estimated)
Cost as input · uncached
Cost as output · uncached
Cost as cached input
§ 05 / SHELF

Up against the shelf.

| Model | Input /M | Cached input /M | Output /M | Effective blended /M | Note | Context | Best for |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Baichuan4 Air (current) | $0.14 | $0.14 | $0.14 | $0.14 | agentic 92/8 | 32K | Lowest-cost Baichuan API traffic |
| Baichuan4 Turbo | $2.11 | $2.11 | $2.11 | $2.11 | pricier | 32K | Balanced Baichuan production traffic |
| Baichuan4 | $14.09 | $14.09 | $14.09 | $14.09 | pricier | 32K | Premium Baichuan 4-series quality |
| Baichuan-M3-Plus | $0.70 | $0.70 | $1.27 | $0.75 | pricier | 32K | Medical copilots with lower hallucination risk |
| Baichuan-M3 | $1.41 | $1.41 | $4.22 | $1.63 | pricier | 32K | Higher-depth medical reasoning |
| GLM-4.7 FlashX | $0.07 | $0.01 | $0.40 | $0.05 | cheaper | 200K | Ultra-cheap Chinese API traffic |
| Gemini 2.5 Flash | $0.30 | $0.03 | $2.50 | $0.27 | pricier | 1M | Global multimodal budget workloads |

Frequently asked.

Practical pricing questions for Baichuan4 Air, especially when comparing Chinese-market budget models.

Q · 01 What is Baichuan4 Air priced at?
Baichuan's official pricing page lists Baichuan4 Air at roughly $0.138/M unified. That comes from 0.00098 yuan per 1K tokens converted at 7.10 CNY/USD, and the same rate applies to both prompt and completion tokens.
Q · 02 Is Baichuan4 Air Baichuan's cheapest current paid model?
Yes, on the current public pricing page. Baichuan4 Air is much cheaper than Baichuan4 Turbo, Baichuan4, Baichuan-M3-Plus, and Baichuan-M3 on a per-token basis.
Q · 03 Does it have cache pricing?
No separate cache-hit discount is listed for Baichuan4 Air. AI//COST therefore prices cached input at the same rate as normal input instead of guessing at an unpublished discount.
Q · 04 How does it compare with Gemini 2.5 Flash?
Baichuan4 Air is cheaper on raw token price: about $0.138/M unified versus Gemini 2.5 Flash's $0.30/M input and $2.50/M output. Gemini still brings a very different global multimodal feature set and a much larger context window.
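To put the gap in absolute terms, the sketch below prices one illustrative workload at both sets of list rates from this page. The 92M input / 8M output volume is a hypothetical example chosen to match the 92/8 split, and Gemini's cached-input discount is ignored for simplicity.

```python
# (input USD per 1M tokens, output USD per 1M tokens), as listed on this page
RATES = {
    "Baichuan4 Air": (0.138, 0.138),
    "Gemini 2.5 Flash": (0.30, 2.50),
}

def workload_cost(model, input_tokens, output_tokens):
    """Uncached list-price cost in USD for a given token volume."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Hypothetical monthly volume: 92M input tokens and 8M output tokens.
for model in RATES:
    cost = workload_cost(model, 92_000_000, 8_000_000)
    print(f"{model}: ${cost:.2f}")
# Baichuan4 Air: $13.80
# Gemini 2.5 Flash: $47.60
```

Price is only one axis of the comparison: a prompt that needs more than 32K tokens of context cannot run on Baichuan4 Air at any price.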
Q · 05 Is there a public batch-discount row?
No separate batch-discount row appears on Baichuan's official pricing page for Baichuan4 Air. Treat the quote board as list pricing unless Baichuan publishes another billing mode.
Q · 06 How accurate is the tokenizer estimate?
The browser widget uses a chars-per-token approximation (baichuan-tokenizer-estimate, ≈3.85 chars/token) intended for planning with English text. Real billing is set by Baichuan's server-side token count and can differ for Chinese, code, or mixed-language prompts.