MULTIMODAL KIMI · 256K CONTEXT · TEXT + VISION + VIDEO · PROMPT CACHING · OPENAI-COMPATIBLE API
Kimi K2.5 API Pricing
Kimi K2.5 is Moonshot's active multimodal Kimi model for coding, vision, and agent workflows. The live Kimi pricing page lists $0.60/M input and $3.00/M output, with cache hits at $0.10/M. Rates are pulled daily from platform.kimi.ai.
Input - per 1M tokens
$0.60/M
Source Kimi API flat
Output - per 1M tokens
$3.00/M
Context 256K flat
Cached input - per 1M tokens
$0.10/M
Cache automatic -83%
Effective - agentic blend
$0.41/M
92/8 split - 82% cache
§ 01 / TERMINAL
Run the numbers.
Live calculator pre-loaded with current Kimi K2.5 rates. Tweak spend, output mix, or cache assumptions, then share the URL to pass the calculation along.
Inputs: monthly spend ($/mo) · workload split · prompt cache hit rate
Outputs: tokens you can process · words equivalent (English) · effective rate
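The calculator's arithmetic can be sketched in a few lines. The page does not spell out the blend formula, so the version below is an assumption: cached and fresh input are mixed by the cache-hit rate, then input and output are mixed by the 92/8 split. It does reproduce the published $0.41/M figure. The function names and the $100 budget are illustrative only.

```python
def effective_rate(input_per_m, cached_per_m, output_per_m,
                   input_share=0.92, cache_hit=0.82):
    """Blended USD price per 1M tokens (assumed formula, see lead-in)."""
    # Input side: cache hits bill at the cached rate, misses at the fresh rate.
    blended_input = cache_hit * cached_per_m + (1 - cache_hit) * input_per_m
    # Whole workload: input_share of tokens are input, the rest are output.
    return input_share * blended_input + (1 - input_share) * output_per_m


def tokens_for_budget(budget_usd, rate_per_m):
    """How many tokens a budget buys at a given blended rate."""
    return budget_usd / rate_per_m * 1_000_000


# Kimi K2.5 rates as listed on this page (USD per 1M tokens, before taxes).
rate = effective_rate(input_per_m=0.60, cached_per_m=0.10, output_per_m=3.00)
print(f"effective rate: ${rate:.2f}/M")

# Hypothetical monthly budget, chosen only for illustration.
print(f"tokens for $100/mo: {tokens_for_budget(100, rate):,.0f}")
```

Lowering `cache_hit` or `input_share` raises the blended rate, which is why output-heavy or cache-cold workloads land well above $0.41/M.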
§ 02 / SCENARIOS
Real-world presets.
CODING AGENT
Long-horizon coding
$0.034/task
VISION
Screenshot to implementation
$0.020/screen
DEEP RESEARCH
Research brief
$0.048/brief
CHATBOT
Support assistant
$0.004/turn
§ 03 / TAPE
Price history.
Input · $0.60/M
Output · $3/M
Cached · $0.10/M
JAN 27 Release pricing at $0.6/M input and $3/M output
MAY 19 Live verification kept $0.6/M input and $3/M output
§ 04 / TOKENIZER
Paste text. See tokens. See cost.
Estimate · moonshot-tokenizer-estimate · ≈3.85 chars/token · Auto-counts as you type
This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.
Outputs: characters · words · tokens (estimated) · cost as input (uncached) · cost as output (uncached) · cost as cached input
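A rough offline version of the same estimate, as a minimal sketch. It uses the ≈3.85 chars/token heuristic quoted above, not a real tokenizer. Rounding up to a whole token and counting words by whitespace are assumptions; the widget may do either differently.

```python
import math

# Heuristic from this page; a real tokenizer will give different counts.
CHARS_PER_TOKEN = 3.85

# Kimi K2.5 rates as listed on this page (USD per 1M tokens, before taxes).
RATES_PER_M = {"input": 0.60, "output": 3.00, "cached_input": 0.10}


def estimate_tokens(text: str) -> int:
    """Approximate token count; rounding up is an assumption."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate(text: str) -> dict:
    """Characters, words, estimated tokens, and cost at each listed rate."""
    tokens = estimate_tokens(text)
    report = {
        "characters": len(text),
        "words": len(text.split()),  # whitespace-separated words
        "tokens_estimated": tokens,
    }
    for kind, rate in RATES_PER_M.items():
        report[f"cost_{kind}_usd"] = tokens * rate / 1_000_000
    return report


sample = "Kimi K2.5 pricing " * 100  # 1,800 characters of filler text
for field, value in estimate(sample).items():
    print(f"{field}: {value}")
```

Treat the result as a planning number only; for code or non-Latin scripts the error band is wider than for English prose.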
| Model | Input /M | Cached input /M | Output /M | Effective blended /M | Note | Context | Best for |
|---|---|---|---|---|---|---|---|
| Kimi K2.5 (current) | $0.60 | $0.10 | $3.00 | $0.41 | agentic 92/8 | 262K | Multimodal Kimi production |
| Kimi K2.6 | $0.95 | $0.16 | $4.00 | $0.60 | pricier | 262K | Frontier Kimi agents |
| Kimi K2 (0905 Preview) | $0.60 | $0.15 | $2.50 | $0.41 | same blend | 262K | Legacy K2 migration checks |
| Kimi K2 Turbo Preview | $1.15 | $0.15 | $8.00 | $0.94 | pricier | 262K | Legacy fast K2 traffic |
| Gemini 2.5 Flash | $0.30 | $0.03 | $2.50 | $0.27 | cheaper | 1M | Multimodal budget work |
| Doubao Seed 2.0 Pro | $0.45 | $0.09 | $2.25 | $0.32 | cheaper | 256K | Chinese multimodal agents |
| DeepSeek V4 Flash | $0.14 | $0.00 | $0.28 | $0.05 | cheaper | 1M | Budget reasoning and coding |
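The "Effective blended" figures in the table can be recomputed from the listed rates. The sketch below assumes the blend is a 92/8 input/output split with an 82% cache-hit rate applied to the input side; the page does not state the formula, but this one matches every row to the cent. The DeepSeek cached rate is entered as the displayed $0.00, which may be a rounded value.

```python
# (input, cached input, output) in USD per 1M tokens, copied from the table.
ROWS = {
    "Kimi K2.5": (0.60, 0.10, 3.00),
    "Kimi K2.6": (0.95, 0.16, 4.00),
    "Kimi K2 (0905 Preview)": (0.60, 0.15, 2.50),
    "Kimi K2 Turbo Preview": (1.15, 0.15, 8.00),
    "Gemini 2.5 Flash": (0.30, 0.03, 2.50),
    "Doubao Seed 2.0 Pro": (0.45, 0.09, 2.25),
    "DeepSeek V4 Flash": (0.14, 0.00, 0.28),  # cached rate as displayed
}


def blended(inp, cached, out, input_share=0.92, cache_hit=0.82):
    """Assumed agentic blend: cache-weighted input, then 92/8 input/output."""
    input_rate = cache_hit * cached + (1 - cache_hit) * inp
    return input_share * input_rate + (1 - input_share) * out


baseline = blended(*ROWS["Kimi K2.5"])

# Cheapest first, with each model's blended rate relative to Kimi K2.5.
for name, prices in sorted(ROWS.items(), key=lambda kv: blended(*kv[1])):
    rate = blended(*prices)
    print(f"{name:<24} ${rate:.2f}/M  {rate / baseline:.2f}x K2.5")
```

Swapping in your own `input_share` and `cache_hit` can change the ordering, since models with expensive output are penalized more as the output share grows.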
Frequently asked.
Practical pricing questions, separated from calculator assumptions and regional taxes.
Q · 01 What is Kimi K2.5 priced at?
Kimi K2.5 is listed at $0.6/M input, $3/M output, and $0.1/M cache-hit input. The Kimi pricing docs state that prices exclude applicable taxes.
Q · 02 How does prompt caching work?
Kimi supports automatic context caching for this model. Cache-hit input is billed at $0.1/M instead of the fresh-input $0.6/M rate, and the calculator's default agentic blend assumes an 82% cache-hit rate.
Q · 03 How is the effective price calculated?
AI//COST uses the same 92/8 agentic blend everywhere. With an 82% cache hit rate, Kimi K2.5's effective blended cost is $0.41/M.
Q · 04 Does this include image and video input?
Kimi K2.5 is documented as a native multimodal model with text, image, and video input. The pricing page bills consumed tokens, so extracted document, image, or video context should be treated as model input once passed into a request.
Q · 05 Is there a batch discount?
Kimi documents BatchJob pricing separately from the live chat-completion model rows. This page stores the standard online inference price; batch economics should be modeled in a dedicated calculator variant when used.
Q · 06 Are taxes or regional surcharges included?
No. The Kimi pricing page says listed prices exclude applicable taxes and that checkout applies tax based on jurisdiction. AI//COST stores the vendor USD token price before taxes.
Q · 07 How accurate is the tokenizer estimate?
The browser widget uses a moonshot-tokenizer-estimate chars-per-token estimate for English text. Actual billing comes from Kimi API usage fields and can differ for Chinese, code, images, video, or mixed-language prompts.
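For billing-grade numbers, cost is better derived from the token counts an API response reports than from a character estimate. The sketch below assumes an OpenAI-style usage object with `prompt_tokens` and `completion_tokens`, plus a `cached_tokens` count that is a subset of `prompt_tokens`. The exact field names and nesting in Kimi responses are not confirmed here, and the sample usage values are made up.

```python
# Kimi K2.5 rates as listed on this page (USD per 1M tokens, before taxes).
INPUT_PER_M = 0.60
CACHED_INPUT_PER_M = 0.10
OUTPUT_PER_M = 3.00


def request_cost(prompt_tokens: int, completion_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Pre-tax USD cost of one request from reported token counts.

    Assumes cached_tokens is the cache-hit portion of prompt_tokens.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    fresh_tokens = prompt_tokens - cached_tokens
    total = (fresh_tokens * INPUT_PER_M
             + cached_tokens * CACHED_INPUT_PER_M
             + completion_tokens * OUTPUT_PER_M)
    return total / 1_000_000


# Hypothetical usage payload; the key names are assumptions, not a documented schema.
usage = {"prompt_tokens": 12_000, "completion_tokens": 800, "cached_tokens": 9_000}
cost = request_cost(**usage)
print(f"request cost: ${cost:.4f}")
```

Summing this per request over real traffic gives an observed blended rate to compare against the $0.41/M default.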