GLM-4.5 BASE · 128K CONTEXT · 96K OUTPUT · PROMPT CACHING · TEXT + CODE

GLM-4.5 API Pricing

GLM-4.5 is Z.AI's base 4.5 MoE model for reasoning, coding, and agent-oriented applications. The live vendor table lists $0.60/M input and $2.20/M output, with cached input at $0.11/M.

Input (per 1M tokens) · $0.60/M · Source Z.AI · flat
Output (per 1M tokens) · $2.20/M · Context 128K · flat
Cached input (per 1M tokens) · $0.11/M · Storage limited-time free · -82%
Effective (agentic blend) · $0.36/M · 92/8 split · 82% cache
§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current GLM-4.5 rates. Adjust spend, output mix, or cache assumptions, then share the URL to pass the calculation along.
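
The core conversion behind a calculator like this can be sketched in a few lines. This is a minimal sketch, not the site's actual implementation: the function name `tokens_for_budget` is hypothetical, and it assumes a single flat blended rate, here the $0.36/M agentic-blend figure quoted on this page.

```python
def tokens_for_budget(monthly_usd: float, usd_per_million: float) -> int:
    """Tokens a monthly budget buys at one flat blended rate (hypothetical helper)."""
    if usd_per_million <= 0:
        raise ValueError("rate must be positive")
    if monthly_usd < 0:
        raise ValueError("budget must not be negative")
    # Rates are quoted per 1M tokens, so scale the quotient back up.
    return round(monthly_usd / usd_per_million * 1_000_000)


# Assumption: the $0.36/M effective rate from the agentic-blend tile.
EFFECTIVE_USD_PER_MILLION = 0.36

if __name__ == "__main__":
    budget = 100.0  # USD per month, illustrative value
    tokens = tokens_for_budget(budget, EFFECTIVE_USD_PER_MILLION)
    print(f"${budget:.2f}/mo buys about {tokens:,} tokens")
```

Changing the workload split or cache hit rate changes the blended rate itself, so those inputs would feed into `usd_per_million` rather than into this function.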

§ 03 / TAPE

Price history.

GLM-4.5 is listed at $0.60/M input and $2.20/M output on the live Z.AI pricing table.

Input · $0.60/M
Output · $2.20/M
Cached · $0.11/M

JUL 28 · GLM-4.5 launch-wave price stored at $0.60/M input and $2.20/M output
JUN 09 · Live verification kept $0.60/M input, $0.11/M cached input, and $2.20/M output
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Calibrated · measured on the vendor's tokenizer · 2026-06-10

Counts use a chars-per-token calibration measured on the vendor's own published tokenizer (zai-org/GLM-5, 2026-06-10). For English prose the estimate is typically within a few percent of the true count; code and non-Latin scripts use more tokens per character. For billing-exact counts, use the vendor's count-tokens API.

Characters 462
Words 67
Tokens (estimated) 88 tokens
Cost as input · uncached $0.000053 USD
Cost as output · uncached $0.000194 USD
Cost as cached input $0.000010 USD
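
The arithmetic behind these figures can be reproduced as follows. This is a sketch under stated assumptions: the helper names are hypothetical, and the default chars-per-token ratio (462 / 88) is inferred from the single sample above, not the site's actual calibration constant.

```python
# Rates in USD per 1M tokens, as listed on this page for GLM-4.5.
RATES_USD_PER_MILLION = {"input": 0.60, "output": 2.20, "cached_input": 0.11}

# Assumption: ratio taken from the one sample above (462 chars, 88 tokens).
SAMPLE_CHARS_PER_TOKEN = 462 / 88


def estimate_tokens(text: str, chars_per_token: float = SAMPLE_CHARS_PER_TOKEN) -> int:
    """Rough token estimate from character count (not billing-exact)."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def cost_usd(tokens: int, kind: str) -> float:
    """Cost of `tokens` tokens billed as input, output, or cached_input."""
    return tokens * RATES_USD_PER_MILLION[kind] / 1_000_000


if __name__ == "__main__":
    tokens = 88  # the sample above
    for kind in RATES_USD_PER_MILLION:
        print(f"{kind}: ${cost_usd(tokens, kind):.6f}")
```

For the 88-token sample, the three costs round to the $0.000053, $0.000194, and $0.000010 shown above.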
§ 05 / SHELF

Up against the shelf.

| Model | Input /M | Output /M | Effective blended | Context | Best for |
|---|---|---|---|---|---|
| GLM-4.5 (current) | $0.60 · cache $0.11 | $2.20 | $0.36 · agentic 92/8 | 128K | Open-weight GLM-4.5 base |
| GLM-4.5 X | $2.20 · cache $0.45 | $8.90 | $1.42 · sibling | 128K | Premium 4.5 reasoning |
| GLM-4.5 Air | $0.20 · cache $0.03 | $1.10 | $0.14 · sibling | 128K | Budget 4.5 agents |
| GLM-4.5 AirX | $1.10 · cache $0.22 | $4.50 | $0.71 · sibling | 128K | Faster Air tier |
| GLM-4.6 | $0.60 · cache $0.11 | $2.20 | $0.36 · sibling | 200K | Newer open GLM coding |
| GLM-4.7 | $0.60 · cache $0.11 | $2.20 | $0.36 · sibling | 200K | Current coding workhorse |
| GLM-5.1 | $1.40 · cache $0.26 | $4.40 | $0.78 · sibling | 200K | Current GLM flagship |
| GLM-4.7 FlashX | $0.07 · cache $0.01 | $0.40 | $0.05 · sibling | 200K | Ultra-cheap GLM traffic |

Frequently asked.

Short answers for teams comparing GLM-4.5 with GLM-4.5 Air, GLM-4.5 X, GLM-4.6, and GLM-5 tiers.

Q · 01 What is GLM-4.5's input price?
Z.AI lists GLM-4.5 at $0.60 per 1M input tokens.
Q · 02 What is GLM-4.5's output price?
The live Z.AI pricing table lists GLM-4.5 output at $2.20 per 1M output tokens.
Q · 03 Does GLM-4.5 support cached input pricing?
Yes. Z.AI lists cached input at $0.11/M, with cached-input storage marked as limited-time free.
Q · 04 What is the GLM-4.5 context window?
The GLM-4.5 model page lists a 128K context length and 96K maximum output tokens.
Q · 05 Is GLM-4.5 a text LLM?
Yes. The Z.AI GLM-4.5 overview lists text input and text output; this page treats it as a text/code model.
Q · 06 How is the effective price calculated?
The effective tile uses AI//COST's standard agentic blend: 92% input, 8% output, and 82% cached-input reuse where the vendor lists a cache rate.
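
One reading of that blend, sketched below, reproduces the effective figures listed on this page. It is not confirmed as AI//COST's exact formula. The function name `effective_rate` is hypothetical, and treating a missing cache rate as "no discount" is an assumption.

```python
from typing import Optional


def effective_rate(
    input_rate: float,
    output_rate: float,
    cached_rate: Optional[float] = None,
    input_share: float = 0.92,  # 92% of tokens are input, 8% output
    cache_hit: float = 0.82,    # 82% of input tokens billed at the cached rate
) -> float:
    """Blended USD per 1M tokens under the agentic-blend assumptions."""
    if cached_rate is None:
        # Assumption: with no listed cache rate, all input is billed at full price.
        blended_input = input_rate
    else:
        blended_input = cache_hit * cached_rate + (1 - cache_hit) * input_rate
    return input_share * blended_input + (1 - input_share) * output_rate


if __name__ == "__main__":
    # GLM-4.5 rates from this page: $0.60 input, $2.20 output, $0.11 cached.
    print(f"GLM-4.5 effective: ${effective_rate(0.60, 2.20, 0.11):.2f}/M")
```

With the GLM-4.5 rates this works out to about $0.358/M, which rounds to the $0.36/M shown in the effective tile. The same call with the GLM-4.5 X rates ($2.20, $8.90, $0.45) rounds to $1.42/M.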