Last verified 2026-05-19

FAST GLM-4.5 · 128K CONTEXT · PROMPT CACHING · TEXT + CODE · USD PRICING

GLM-4.5 AirX API Pricing

GLM-4.5 AirX is the fast mid-tier GLM-4.5 SKU, positioned between Air and X. The live Z.AI pricing table lists $1.10/M input and $4.50/M output, with cached input at $0.22/M. Prices are pulled directly from docs.z.ai daily.

Metric	Price	Notes
Input, per 1M tokens	$1.10/M	Source: Z.AI, flat
Output, per 1M tokens	$4.50/M	Context: 128K, flat
Cached input, per 1M tokens	$0.22/M	Storage: limited-time free, -80%
Effective, agentic blend	$0.71/M	92/8 split, 82% cache

§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current GLM-4.5 AirX rates. Tweak spend, output mix, or cache assumptions, then share the URL to pass the calculation along.

Inputs: spend ($/mo), workload split, prompt cache hit rate.
Outputs: tokens you can process, words equivalent (English), effective rate.

Open full calculator (all models · share URL · CSV) →
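
The calculator's core arithmetic can be sketched in a few lines. This is a minimal illustration, not the site's actual code: the function names `effective_rate` and `tokens_for_budget` are hypothetical, the $100 budget is an example value, and the sketch assumes the cache hit rate applies to input tokens only.

```python
def effective_rate(input_rate, cached_rate, output_rate, output_share, cache_hit_rate):
    """Blended USD per 1M tokens for a given output share and cache hit rate."""
    input_share = 1.0 - output_share
    # Assumption: cache hits only discount input tokens, never output tokens.
    blended_input = cache_hit_rate * cached_rate + (1.0 - cache_hit_rate) * input_rate
    return input_share * blended_input + output_share * output_rate


def tokens_for_budget(spend_usd, rate_per_million):
    """Total tokens a budget covers at a blended per-million rate."""
    return spend_usd / rate_per_million * 1_000_000


# GLM-4.5 AirX list rates from this page (USD per 1M tokens),
# with the page's default 92/8 split and 82% cache hit rate.
rate = effective_rate(1.10, 0.22, 4.50, output_share=0.08, cache_hit_rate=0.82)
print(f"effective rate: ${rate:.2f}/M")
print(f"tokens for $100/mo: {tokens_for_budget(100, rate):,.0f}")
```

Raising `output_share` or lowering `cache_hit_rate` pushes the effective rate toward the $4.50/M output price, which is why the two sliders matter more than the headline input price.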

§ 02 / SCENARIOS

Real-world presets.

CODING AGENT

Small repo implementation

$0.038/task

22,000 in - 3,000 out · ~2,652 units/$100

CODE REVIEW

Pull request review

$0.024/review

14,000 in - 1,800 out · ~4,255 units/$100

RAG

Knowledge base answer

$0.014/query

9,000 in - 1,000 out · ~6,944 units/$100

CHATBOT

Support assistant

$0.005/turn

2,500 in - 600 out · ~18,348 units/$100
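
The preset figures can be reproduced with a short sketch. It assumes every input token is billed at the uncached $1.10/M rate, which is an inference from the arithmetic rather than something the page states, and the helper name `cost_per_unit` is hypothetical.

```python
INPUT_RATE = 1.10   # USD per 1M fresh input tokens (this page)
OUTPUT_RATE = 4.50  # USD per 1M output tokens (this page)

# Preset name -> (input tokens, output tokens), as listed above.
SCENARIOS = {
    "Small repo implementation": (22_000, 3_000),
    "Pull request review": (14_000, 1_800),
    "Knowledge base answer": (9_000, 1_000),
    "Support assistant": (2_500, 600),
}


def cost_per_unit(tokens_in, tokens_out):
    """USD for one task, assuming no cached input (an assumption)."""
    return (tokens_in * INPUT_RATE + tokens_out * OUTPUT_RATE) / 1_000_000


for name, (tokens_in, tokens_out) in SCENARIOS.items():
    cost = cost_per_unit(tokens_in, tokens_out)
    # Whole units only: a partial task does not count toward the budget.
    print(f"{name}: ${cost:.4f} per unit, ~{int(100 / cost):,} units per $100")
```

With a warm prompt cache the per-unit cost would be lower than these presets, since part of the input would be billed at $0.22/M instead.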

§ 03 / TOKENIZER

Paste text. See tokens. See cost.

Your text · live count


Counts use a chars-per-token calibration measured on the vendor's own published tokenizer (zai-org/GLM-5, 2026-06-10). English prose is typically within a few percent; code and non-Latin scripts tokenize heavier. For billing-exact counts use the vendor's count-tokens API.

Characters 482

Words 72

Tokens (estimated) 92 tokens

Cost as input · uncached $0.000101 USD

Cost as output · uncached $0.000414 USD

Cost as cached input $0.000020 USD
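
The three cost lines follow directly from the token count and the list rates. The sketch below is illustrative only: `estimate_tokens` and `token_costs` are hypothetical names, and the default `chars_per_token` value is a placeholder, not the widget's actual calibration.

```python
RATES_PER_MILLION = {"input": 1.10, "output": 4.50, "cached_input": 0.22}  # USD, this page


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate; chars_per_token is a placeholder assumption."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def token_costs(tokens):
    """USD cost of the same token count under each billing category."""
    return {kind: tokens * rate / 1_000_000 for kind, rate in RATES_PER_MILLION.items()}


# Reproduce the widget's cost lines for its 92-token sample.
for kind, usd in token_costs(92).items():
    print(f"{kind}: ${usd:.6f}")
```

For billing-exact numbers, the token count should come from the vendor's API usage fields rather than from a character-based estimate like this one.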

§ 04 / SHELF

Up against the shelf.


Model	Input /M	Cached input /M	Output /M	Effective blended /M (agentic 92/8)	vs. AirX	Context	Best for
GLM-4.5 AirX (current)	$1.10	$0.22	$4.50	$0.71	baseline	128K	Fast GLM-4.5 production
GLM-4.5 X	$2.20	$0.45	$8.90	$1.42	pricier	128K	Harder GLM-4.5 reasoning
GLM-4.5 Air	$0.20	$0.03	$1.10	$0.14	cheaper	128K	Budget GLM-4.5 production
GLM-4.7	$0.60	$0.11	$2.20	$0.36	cheaper	200K	Mid-tier GLM agents
GLM-5	$1.00	$0.20	$3.20	$0.57	cheaper	200K	Balanced GLM flagship
GLM-4.7 FlashX	$0.07	$0.01	$0.40	$0.05	cheaper	200K	Ultra-cheap GLM traffic
DeepSeek V4 Flash	$0.14	$0.00	$0.28	$0.05	cheaper	1M	Budget reasoning and coding
Qwen 3.5 Flash	$0.10	not listed	$0.40	$0.12	cheaper	1M	Bulk Qwen long-context work

Frequently asked.

Practical pricing questions, separated from calculator assumptions and regional tiers.

Q · 01 What is GLM-4.5 AirX priced at?

GLM-4.5 AirX is listed at $1.10/M input and $4.50/M output on the live Z.AI pricing table. Cached input is listed at $0.22/M. All prices on this page are in USD per million tokens.

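Q · 02 How is the effective price calculated?

AI//COST uses the same 92/8 agentic blend everywhere: 92% of tokens are input and 8% are output. With an 82% cache hit rate, 82% of input tokens are billed at the vendor cache-read rate ($0.22/M) and the remaining 18% at the fresh input rate ($1.10/M). On that basis, GLM-4.5 AirX's effective blended cost is $0.71/M.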
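
Worked through with the list rates, the blend comes out as follows. This reads 92/8 as the input/output token split and applies the 82% hit rate to input tokens only; that interpretation reproduces the published figure but is not a formula quoted from AI//COST.

```latex
\begin{aligned}
r_{\text{in}}  &= 0.82 \times 0.22 + 0.18 \times 1.10 = 0.3784 \\
r_{\text{eff}} &= 0.92 \times r_{\text{in}} + 0.08 \times 4.50
               = 0.3481 + 0.3600 \approx 0.71 \ \text{USD per 1M tokens}
\end{aligned}
```

Output tokens are only 8% of the mix but contribute about half of the effective rate, because the output price is roughly twelve times the blended input price.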

Q · 03 Is prompt caching priced separately?

Yes. The vendor table lists cached input at $0.22/M versus $1.10/M for fresh input. Cached input storage is listed as limited-time free on the Z.AI page.

Q · 04 Are regional prices different?

Z.AI publishes the official developer pricing page in USD. Chinese BigModel pages may surface overlapping model catalogs, but the quote tiles use the baseline row from the Z.AI developer pricing page, not a reseller or proxy price.

Q · 05 Is there a batch discount?

No separate batch discount row is listed on the Z.AI pricing page for GLM-4.5 AirX. The quote tiles show real-time list pricing; batch economics should be treated as a separate calculator variant only when the vendor documents it.

Q · 06 How accurate is the tokenizer estimate?

The browser widget uses a chars-per-token estimate (zhipu-tokenizer-estimate) calibrated for English text. It is useful for rough planning, but actual billing comes from the usage fields in the vendor API response and can differ for Chinese, code, or mixed-language prompts.

Reviewed by Yaroslav Vikhariev, Founder, AI//COST · Pricing pulled daily from docs.z.ai · Last verified May 19, 2026
