Last verified 2026-05-19

MID-TIER GLM · 200K CONTEXT · 128K OUTPUT · PROMPT CACHING · TEXT + CODE

GLM-4.7 API Pricing

GLM-4.7 is Zhipu's mid-tier GLM model for coding, tool use, and general agent workloads. The live vendor table lists $0.60/M input and $2.20/M output, with cached input at $0.11/M. Prices are pulled daily from docs.z.ai.

Metric	Price	Note
Input - per 1M tokens	$0.60/M	Source Z.AI · flat
Output - per 1M tokens	$2.20/M	Context 200K · flat
Cached input - per 1M tokens	$0.11/M	Storage limited-time free · -82%
Effective - agentic blend	$0.36/M	92/8 split · 82% cache

§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current GLM-4.7 rates. Adjust spend, output mix, or cache assumptions, then share the URL to pass the calculation along.

Inputs: Spend ($/mo) · Workload split · Prompt cache hit rate

Outputs: Tokens you can process · Words equivalent (English) · Effective rate
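The core conversion behind the "Tokens you can process" output can be sketched as follows. This is a minimal illustration, not the calculator's actual code: the function name `tokens_for_budget` is hypothetical, the $0.36/M figure is the effective rate quoted on this page, and splitting the result 92/8 between input and output is an assumption based on the page's agentic blend.

```python
def tokens_for_budget(spend_usd: float, rate_per_million: float,
                      output_share: float = 0.08) -> dict:
    """Convert a budget into token counts at a blended USD-per-million rate.

    output_share is the assumed fraction of tokens that are output
    (0.08 mirrors the 92/8 blend named on this page).
    """
    if rate_per_million <= 0:
        raise ValueError("rate_per_million must be positive")
    total = int(spend_usd / rate_per_million * 1_000_000)
    output_tokens = int(total * output_share)
    return {
        "total": total,
        "input": total - output_tokens,
        "output": output_tokens,
    }


# $100/month at the page's $0.36/M effective rate: roughly 278 million tokens.
budget = tokens_for_budget(100.0, 0.36)
print(f"{budget['total']:,} tokens "
      f"({budget['input']:,} in / {budget['output']:,} out)")
```

A free model (rate of $0.00/M) has no meaningful token budget, which is why the sketch rejects non-positive rates instead of dividing by zero.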

§ 02 / SCENARIOS

Real-world presets.

Scenario	Preset	Cost per unit	Tokens in	Tokens out	Units per $100
CODING AGENT	Repo implementation	$0.020/task	22,000	3,000	~5,051
CODE REVIEW	Pull request review	$0.012/review	14,000	1,800	~8,065
RAG	Knowledge base answer	$0.008/query	9,000	1,000	~13,158
CHATBOT	Support assistant	$0.003/turn	2,500	600	~35,714
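The preset figures above can be reproduced from the list rates with a short sketch. Two points here are inferences rather than documented method: the presets appear to use uncached input pricing, and the units-per-$100 figures match only if the per-unit cost is rounded to four decimals before dividing. The names `PRESETS`, `unit_cost`, and `units_per_100` are hypothetical.

```python
INPUT_PER_M = 0.60   # USD per 1M fresh input tokens (from this page)
OUTPUT_PER_M = 2.20  # USD per 1M output tokens (from this page)

# (tokens in, tokens out) for each preset listed on this page
PRESETS = {
    "coding_agent": (22_000, 3_000),
    "code_review": (14_000, 1_800),
    "rag": (9_000, 1_000),
    "chatbot": (2_500, 600),
}


def unit_cost(tokens_in: int, tokens_out: int) -> float:
    """Cost in USD of one task at uncached list prices."""
    return (tokens_in * INPUT_PER_M + tokens_out * OUTPUT_PER_M) / 1_000_000


def units_per_100(tokens_in: int, tokens_out: int) -> int:
    """Tasks affordable for $100.

    Assumption: the per-unit cost is rounded to 4 decimals first,
    which is what makes the listed figures line up.
    """
    return round(100 / round(unit_cost(tokens_in, tokens_out), 4))


for name, (t_in, t_out) in PRESETS.items():
    print(f"{name}: ${unit_cost(t_in, t_out):.3f}/unit, "
          f"~{units_per_100(t_in, t_out):,} units/$100")
```

With prompt caching enabled, real per-task costs would come in lower than these uncached figures.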

§ 03 / TOKENIZER

Paste text. See tokens. See cost.

Counts use a chars-per-token calibration measured on the vendor's own published tokenizer (zai-org/GLM-5, 2026-06-10). For English prose the estimate is typically within a few percent of the actual count; code and non-Latin scripts produce more tokens per character. For billing-exact counts, use the vendor's count-tokens API.

Characters: 480

Words: 71

Tokens (estimated): 91

Cost as input · uncached: $0.000055 USD

Cost as output · uncached: $0.000200 USD

Cost as cached input: $0.000010 USD
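The three cost lines follow directly from the token estimate and the per-million rates. A minimal sketch, assuming the 91-token estimate shown above; the names `RATES_PER_M` and `cost_usd` are hypothetical:

```python
# USD per 1M tokens, as listed on this page
RATES_PER_M = {
    "input": 0.60,
    "output": 2.20,
    "cached_input": 0.11,
}


def cost_usd(tokens: int, kind: str) -> float:
    """Cost of `tokens` tokens billed at the given rate kind."""
    return tokens * RATES_PER_M[kind] / 1_000_000


for kind in RATES_PER_M:
    print(f"{kind}: ${cost_usd(91, kind):.6f}")
```

Because the token count is itself an estimate, the result is only as accurate as the chars-per-token calibration for the text in question.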

§ 04 / SHELF

Up against the shelf.

Model	Input /M	Cached input /M	Output /M	Effective blended (agentic 92/8)	vs. GLM-4.7	Context	Best for
GLM-4.7 (current)	$0.60	$0.11	$2.20	$0.36	baseline	200K	Mid-tier GLM agents
GLM-5.1	$1.40	$0.26	$4.40	$0.78	pricier	200K	Flagship GLM coding
GLM-4.6	$0.60	$0.11	$2.20	$0.36	same blend	200K	Previous mid-tier GLM
GLM-4.7 FlashX	$0.07	$0.01	$0.40	$0.05	cheaper	200K	Ultra-cheap GLM traffic
GLM-4.7 Flash	$0.00	$0.00	$0.00	$0.00	cheaper	200K	Free registered-user tier
Qwen3 Coder Flash	$0.30	not listed	$1.50	$0.40	pricier	1M	Cheap code-focused Qwen
DeepSeek V4 Flash	$0.14	$0.00	$0.28	$0.05	cheaper	1M	Budget reasoning and coding

Frequently asked.

Practical pricing questions, separated from calculator assumptions and regional tiers.

Q · 01 What is GLM-4.7 priced at?

GLM-4.7 is listed at $0.60/M input and $2.20/M output on the live Z.AI pricing table. All prices on this page are in USD per million tokens.

Q · 02 How is the effective price calculated?

AI//COST applies the same 92/8 agentic blend to every model: 92% of tokens are billed as input and 8% as output. With 82% of input tokens served from cache, GLM-4.7's effective blended cost is $0.36/M.
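The blended figure can be reproduced with a short sketch. It assumes that "92/8" means 92% input and 8% output tokens, and that the cache hit rate applies to input tokens only; these assumptions are consistent with the effective rates in the comparison table but are not spelled out by the source. The function name `effective_rate` is hypothetical.

```python
from typing import Optional


def effective_rate(input_rate: float, output_rate: float,
                   cached_rate: Optional[float] = None,
                   output_share: float = 0.08,
                   cache_hit_rate: float = 0.82) -> float:
    """Blended USD per 1M tokens for a mixed input/output workload.

    Assumptions: output_share is the fraction of tokens billed as output,
    and cache_hit_rate is the fraction of input tokens billed at cached_rate.
    Models with no cached price are billed at the fresh input rate.
    """
    if cached_rate is None:
        blended_input = input_rate
    else:
        blended_input = (cache_hit_rate * cached_rate
                         + (1 - cache_hit_rate) * input_rate)
    return (1 - output_share) * blended_input + output_share * output_rate


# GLM-4.7: 0.92 * (0.82 * 0.11 + 0.18 * 0.60) + 0.08 * 2.20
print(round(effective_rate(0.60, 2.20, cached_rate=0.11), 2))  # 0.36
```

Lowering `cache_hit_rate` toward zero moves the result toward the uncached blend, so the quoted $0.36/M is best read as a figure for cache-heavy agent loops rather than a general average.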

Q · 03 Is prompt caching priced separately?

Yes. The vendor table lists cached input at $0.11/M versus $0.60/M for fresh input. Cached input storage is listed as limited-time free on the Z.AI page.

Q · 04 Are regional prices different?

Z.AI publishes the official developer pricing page in USD; Chinese BigModel pages may surface the same model catalog in Chinese. The quote tiles use the baseline row named in the source page, not a reseller or proxy price.

Q · 05 Is there a batch discount?

No separate batch discount row is listed on the Z.AI pricing page for GLM text models. The quote tiles show real-time list pricing; batch economics should be treated as a separate calculator variant when supported.

Q · 06 How accurate is the tokenizer estimate?

The browser widget uses a zhipu-tokenizer-estimate chars-per-token estimate for English text. It is useful for rough planning, but actual billing comes from the vendor API usage fields and can differ for Chinese, code, or mixed-language prompts.

Reviewed by Yaroslav Vikhariev, Founder, AI//COST. Pricing pulled daily from docs.z.ai. Last verified May 19, 2026.
