MULTIMODAL KIMI · 256K CONTEXT · TEXT + VISION + VIDEO · PROMPT CACHING · OPENAI-COMPATIBLE API

Kimi K2.5 API Pricing

Kimi K2.5 is Moonshot Kimi's active multimodal model for coding, vision, and agent workflows. The live Kimi pricing surface lists $0.60/M input and $3.00/M output, with cache hits at $0.10/M. Prices are pulled directly from platform.kimi.ai daily.

Input (per 1M tokens): $0.60/M · Source: Kimi API · flat
Output (per 1M tokens): $3.00/M · Context: 256K · flat
Cached input (per 1M tokens): $0.10/M · Cache: automatic · -83%
Effective (agentic blend): $0.41/M · 92/8 split · 82% cache
§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current Kimi K2.5 rates. Adjust spend, output mix, or cache assumptions, then share the URL to pass the calculation along.

Inputs: monthly spend ($/mo) · workload split · prompt cache hit rate
Outputs: tokens you can process · words equivalent (English) · effective rate
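The calculator's arithmetic can be approximated in a few lines. This is a minimal sketch, not the page's actual implementation: it assumes the 92/8 split means 92% input tokens and 8% output tokens, and the `WORDS_PER_TOKEN` factor is a rough rule of thumb rather than a value the page states. The function names and the $100 budget are hypothetical.

```python
# Rates listed on this page, in USD per 1M tokens.
INPUT_RATE = 0.60
CACHED_RATE = 0.10
OUTPUT_RATE = 3.00

WORDS_PER_TOKEN = 0.75  # assumption: rough English rule of thumb, not a page value


def effective_rate(output_share: float = 0.08, cache_hit: float = 0.82) -> float:
    """Blended USD per 1M tokens for a workload split and cache-hit rate."""
    # Input tokens are billed at the cached rate on a hit, the fresh rate otherwise.
    input_blend = cache_hit * CACHED_RATE + (1 - cache_hit) * INPUT_RATE
    return (1 - output_share) * input_blend + output_share * OUTPUT_RATE


def tokens_for_budget(usd_per_month: float, output_share: float = 0.08,
                      cache_hit: float = 0.82) -> float:
    """Total tokens (input + output) that a monthly budget covers."""
    return usd_per_month / effective_rate(output_share, cache_hit) * 1_000_000


if __name__ == "__main__":
    budget = 100.0  # hypothetical monthly spend in USD
    tokens = tokens_for_budget(budget)
    print(f"effective rate: ${effective_rate():.2f}/M")
    print(f"tokens: {tokens / 1e6:.1f}M")
    print(f"words (approx.): {tokens * WORDS_PER_TOKEN / 1e6:.1f}M")
```

Raising `output_share` or lowering `cache_hit` pushes the effective rate up, which is why the same budget covers fewer tokens for output-heavy or cache-cold workloads.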
§ 02 / SCENARIOS

Real-world presets.

§ 03 / TAPE

Price history.

Kimi K2.5 is listed at $0.6/M input and $3/M output on the live Kimi pricing surface.

Input · $0.60/M
Output · $3/M
Cached · $0.10/M
JAN 27 · Release pricing at $0.6/M input and $3/M output
MAY 19 · Live verification kept $0.6/M input and $3/M output
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · moonshot-tokenizer-estimate · ≈3.85 chars/token · Auto-counts as you type

This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.

The widget reports: characters · words · tokens (estimated) · cost as uncached input · cost as output · cost as cached input
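The chars-per-token approximation described above can be sketched as follows. The 3.85 divisor and the three rates come from this page; the function names are hypothetical, and rounding to the nearest token is an assumption, since the page does not say how the widget rounds.

```python
CHARS_PER_TOKEN = 3.85  # the page's stated approximation for English text

# USD per 1M tokens, as listed on this page.
RATES_PER_M = {"input": 0.60, "output": 3.00, "cached": 0.10}


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length (not a real tokenizer)."""
    # Assumption: round to the nearest whole token.
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_costs(text: str) -> dict:
    """Estimated USD cost of the text at each listed rate."""
    tokens = estimate_tokens(text)
    return {kind: tokens * rate / 1_000_000 for kind, rate in RATES_PER_M.items()}


if __name__ == "__main__":
    sample = "Paste text. See tokens. See cost."
    print(len(sample), "chars ->", estimate_tokens(sample), "tokens (estimated)")
    for kind, usd in estimate_costs(sample).items():
        print(f"{kind}: ${usd:.8f}")
```

Because this only divides the character count, it inherits the error margin noted above and should not be used to reconcile an invoice.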
§ 05 / SHELF

Up against the shelf.

Model | Input /M | Cached input /M | Output /M | Effective blended /M | Note | Context | Best for
Kimi K2.5 (current) | $0.60 | $0.10 | $3.00 | $0.41 | agentic 92/8 | 262K | Multimodal Kimi production
Kimi K2.6 | $0.95 | $0.16 | $4.00 | $0.60 | pricier | 262K | Frontier Kimi agents
Kimi K2 (0905 Preview) | $0.60 | $0.15 | $2.50 | $0.41 | same blend | 262K | Legacy K2 migration checks
Kimi K2 Turbo Preview | $1.15 | $0.15 | $8.00 | $0.94 | pricier | 262K | Legacy fast K2 traffic
Gemini 2.5 Flash | $0.30 | $0.03 | $2.50 | $0.27 | cheaper | 1M | Multimodal budget work
Doubao Seed 2.0 Pro | $0.45 | $0.09 | $2.25 | $0.32 | cheaper | 256K | Chinese multimodal agents
DeepSeek V4 Flash | $0.14 | $0.00 | $0.28 | $0.05 | cheaper | 1M | Budget reasoning and coding

Frequently asked.

Practical pricing questions, separated from calculator assumptions and regional taxes.

Q · 01 What is Kimi K2.5 priced at?
Kimi K2.5 is listed at $0.6/M input, $3/M output, and $0.1/M cache-hit input. The Kimi pricing docs state that prices exclude applicable taxes.
Q · 02 How does prompt caching work?
Kimi supports automatic context caching for this model. Cache-hit input is billed at $0.1/M instead of the fresh-input $0.6/M rate, and the calculator's default agentic blend assumes an 82% cache-hit rate.
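To see what the cache-hit rate means for an input bill, the sketch below prices a hypothetical 10M-token input workload with and without caching. The two rates and the 82% hit rate come from this page; the workload size and the function name are illustrative assumptions.

```python
INPUT_RATE = 0.60   # USD per 1M fresh input tokens (from this page)
CACHED_RATE = 0.10  # USD per 1M cache-hit input tokens (from this page)


def input_cost(input_tokens: int, cache_hit: float) -> float:
    """USD cost of input tokens when a fraction `cache_hit` is served from cache."""
    cached = input_tokens * cache_hit
    fresh = input_tokens - cached
    return (cached * CACHED_RATE + fresh * INPUT_RATE) / 1_000_000


# Per-token discount on a cache hit, relative to the fresh-input rate.
discount = 1 - CACHED_RATE / INPUT_RATE

workload = 10_000_000                     # hypothetical input volume in tokens
no_cache = input_cost(workload, 0.0)      # every token billed as fresh input
with_cache = input_cost(workload, 0.82)   # the page's default hit rate

if __name__ == "__main__":
    print(f"cache-hit discount: {discount:.0%}")
    print(f"input cost without caching: ${no_cache:.2f}")
    print(f"input cost at 82% hits: ${with_cache:.2f}")
```

The per-token discount matches the -83% shown next to the cached rate; the saving on a whole bill is smaller because uncached input and all output are still billed at full price.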
Q · 03 How is the effective price calculated?
AI//COST applies the same 92/8 agentic blend to every model it lists. With an 82% cache-hit rate, Kimi K2.5's effective blended cost is $0.41/M.
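The page does not spell out the formula. One reading that reproduces $0.41/M, and also the other blended figures in the comparison table, treats 92/8 as 92% input tokens and 8% output tokens, with the cache-hit rate applied to the input side only. Under that assumption, with $s_{\text{in}}$ and $s_{\text{out}}$ as the token shares, $h$ as the cache-hit rate, and $p$ as the per-million prices:

```latex
\begin{aligned}
E &= s_{\text{in}}\,\bigl[h\,p_{\text{cache}} + (1-h)\,p_{\text{in}}\bigr] + s_{\text{out}}\,p_{\text{out}} \\
  &= 0.92\,\bigl[0.82 \times 0.10 + 0.18 \times 0.60\bigr] + 0.08 \times 3.00 \\
  &= 0.92 \times 0.19 + 0.24 \\
  &= 0.1748 + 0.24 = 0.4148 \approx \$0.41/\text{M}
\end{aligned}
```

Output contributes $0.24 of the $0.41, so the output price dominates the blend even at an 8% share.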
Q · 04 Does this include image and video input?
Kimi K2.5 is documented as a native multimodal model with text, image, and video input. The pricing page bills consumed tokens, so extracted document, image, or video context should be treated as model input once passed into a request.
Q · 05 Is there a batch discount?
Kimi documents BatchJob pricing separately from the live chat-completion model rows. This page stores the standard online inference price; batch economics should be modeled in a dedicated calculator variant when used.
Q · 06 Are taxes or regional surcharges included?
No. The Kimi pricing page says listed prices exclude applicable taxes and that checkout applies tax based on jurisdiction. AI//COST stores the vendor USD token price before taxes.
Q · 07 How accurate is the tokenizer estimate?
The browser widget uses a chars-per-token approximation (moonshot-tokenizer-estimate, ≈3.85 chars/token) tuned for English text. Actual billing comes from Kimi API usage fields and can differ for Chinese, code, images, video, or mixed-language prompts.
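For exact figures, the cost of a request can be computed from the usage object the API returns rather than from an estimate. The sketch below assumes an OpenAI-style `usage` dictionary with `prompt_tokens` and `completion_tokens`. The `cached_tokens` key, and the idea that cached tokens are counted inside `prompt_tokens`, are assumptions that should be checked against the vendor's response schema. The function name and sample values are hypothetical.

```python
# USD per 1M tokens, as listed on this page (taxes excluded).
FRESH_INPUT_RATE = 0.60
CACHED_INPUT_RATE = 0.10
OUTPUT_RATE = 3.00


def cost_from_usage(usage: dict) -> float:
    """USD cost of one request from an OpenAI-style `usage` dictionary."""
    # Assumption: cache hits are reported under "cached_tokens" and are
    # already included in "prompt_tokens". Verify against the real schema.
    cached = usage.get("cached_tokens", 0)
    fresh = usage["prompt_tokens"] - cached
    output = usage["completion_tokens"]
    total = (fresh * FRESH_INPUT_RATE
             + cached * CACHED_INPUT_RATE
             + output * OUTPUT_RATE)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical usage values for a single agentic turn.
    sample = {"prompt_tokens": 10_000, "completion_tokens": 500, "cached_tokens": 8_000}
    print(f"${cost_from_usage(sample):.4f}")
```

Summing this over logged responses gives a pre-tax spend figure that can be compared with the widget's estimate.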