ORIGINAL M2 · 205K CONTEXT · TEXT ONLY · PROMPT CACHING · AGENTIC CODING
MiniMax M2 API Pricing
MiniMax M2 is MiniMax's original M2 agentic coding model. The official pricing page lists $0.30/M input, $1.20/M output, and $0.03/M cached input. Rates are pulled directly from platform.minimax.io daily.
Input - per 1M tokens
$0.30/M
Direct USD vendor row · base tier
Output - per 1M tokens
$1.20/M
Direct USD vendor row · 4x input
Cached input - prompt cache read
$0.03/M
Cache write $0.375/M · cache read 90% off input
Effective - agentic blend
$0.17/M
92/8 split - 82% cache
§ 01 / TERMINAL
Run the numbers.
Live calculator pre-loaded with current MiniMax M2 rates. Adjust the workload split and cache hit rate, then share the URL to pass the calculation along.
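The blended figure can be approximated offline. The sketch below is not the page's own calculator code; it assumes that cache hits bill at the cached-read rate, cache misses bill at the full input rate, and cache-write charges are ignored. Under those assumptions the 92/8 split with an 82% cache hit rate reproduces the $0.17/M effective rate shown on this page. The names `Rates`, `effective_rate`, and `tokens_for_budget` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens, defaulting to the MiniMax M2 rates listed on this page."""
    input_rate: float = 0.30
    output_rate: float = 1.20
    cached_rate: float = 0.03


def effective_rate(rates: Rates, input_share: float = 0.92, cache_hit: float = 0.82) -> float:
    """Blended USD per 1M tokens for a given workload split.

    Assumption: cache hits pay the cached-read rate, misses pay the full
    input rate, and cache-write charges are not modelled.
    """
    blended_input = cache_hit * rates.cached_rate + (1 - cache_hit) * rates.input_rate
    return input_share * blended_input + (1 - input_share) * rates.output_rate


def tokens_for_budget(budget_usd: float, rate_per_million: float) -> int:
    """Total tokens a budget covers at a blended per-million rate."""
    return int(budget_usd / rate_per_million * 1_000_000)


m2 = Rates()
blended = effective_rate(m2)
print(f"effective: ${blended:.2f}/M")
print(f"$100/mo covers about {tokens_for_budget(100, blended):,} tokens")
```

Passing `cache_hit=0` models a workload with no cache reuse, which raises the blended rate toward the uncached input price.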
§ 02 / SCENARIOS
Real-world presets.
CODING AGENT
Repo repair task
$0.020/task
OFFICE
Analyst memo draft
$0.016/memo
SEARCH
Web research brief
$0.011/brief
CHATBOT
Product support turn
$0.003/turn
§ 03 / TAPE
Price history.
Input · $0.30/M
Output · $1.20/M
Cached · $0.03/M
OCT 27 · M2 launched for agents and code
MAY 23 · Verified unchanged on MiniMax pay-as-you-go pricing docs
§ 04 / TOKENIZER
Paste text. See tokens. See cost.
Estimate · minimax-tokenizer-estimate · ≈3.85 chars/token
This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead; counts are typically ±10–20% off for English prose, and further off for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.
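The same estimate is easy to reproduce in a script. The sketch below applies the ≈3.85 chars/token ratio and the three MiniMax M2 rates listed on this page; rounding the token count up is an assumption, and the function names are hypothetical. It inherits the accuracy caveats of the approximation and is not a substitute for the vendor's tokenizer.

```python
import math

# The page's stated approximation; not a real tokenizer.
CHARS_PER_TOKEN = 3.85

# USD per 1M tokens, from the MiniMax M2 rates listed on this page.
RATES = {"input": 0.30, "output": 1.20, "cached_input": 0.03}


def estimate_tokens(text: str) -> int:
    """Rough token count from character length (rounded up, by assumption)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_costs(text: str) -> dict:
    """Estimated USD cost of the text billed at each listed rate."""
    tokens = estimate_tokens(text)
    return {kind: tokens * rate / 1_000_000 for kind, rate in RATES.items()}


sample = "Summarize the failing test and propose a minimal patch. " * 1000
print("characters:", len(sample))
print("tokens (estimated):", estimate_tokens(sample))
for kind, usd in estimate_costs(sample).items():
    print(f"cost as {kind}: ${usd:.6f}")
```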
| Model | Input /M | Output /M | Effective blended /M | Context | Best for |
|---|---|---|---|---|---|
| MiniMax M2 (current) | $0.30 (cache $0.03) | $1.20 | $0.17 (agentic 92/8) | 205K | Original M2 coding agents |
| MiniMax M2.1 | $0.30 (cache $0.03) | $1.20 | $0.17 (same effective tier) | 205K | Stable coding and office agents |
| MiniMax M2.5 | $0.30 (cache $0.03) | $1.20 | $0.17 (same effective tier) | 205K | Value-first MiniMax agent loops |
| MiniMax M2-her | $0.30 | $1.20 | $0.37 (pricier MiniMax sibling) | 64K | Role-play and long dialogue |
| MiniMax M2.7 | $0.30 (cache $0.06) | $1.20 | $0.19 (pricier MiniMax sibling) | 205K | MiniMax flagship agent loops |
| MiniMax M2.1 Highspeed | $0.60 (cache $0.03) | $2.40 | $0.31 (pricier MiniMax sibling) | 205K | Lower-latency M2.1 agents |
| GPT-5.4 mini | $0.75 (cache $0.07) | $4.50 | $0.54 (verified shelf sibling) | 400K | OpenAI subagent workloads |
| DeepSeek V4 Pro | $0.43 (cache $0.00) | $0.87 | $0.14 (verified shelf sibling) | 1M | Low-cost reasoning workloads |
Frequently asked.
Practical MiniMax M2 pricing questions, with live MiniMax list rates separated from workload assumptions.
Q · 01 What is the standard MiniMax M2 API price?
MiniMax's official pay-as-you-go pricing page lists MiniMax-M2 at $0.30/M input and $1.20/M output. Cache reads are listed at $0.03/M. AI//COST stores those direct USD list prices without currency conversion.
Q · 02 Does MiniMax publish prompt caching for this model?
MiniMax lists prompt-cache reads at $0.03/M and cache writes at $0.375/M for MiniMax M2. The quote board uses the read price because repeated cache hits drive recurring workload cost.
Q · 03 What context window does MiniMax M2 support?
MiniMax's text-generation docs list 204,800 tokens of context for this M2-family text model. This page rounds that to 205K for display consistency.
Q · 04 Is there a Batch API discount?
MiniMax's current pay-as-you-go table does not publish a separate Batch API discount for MiniMax M2. Treat the public $0.30/M input and $1.20/M output rates as the default unless your account has a negotiated enterprise agreement.
Q · 05 When was this price last checked?
This page was verified against MiniMax's official pay-as-you-go pricing page on May 23, 2026. The same page currently lists the active text rows for M2.7, M2.5, M2.1, M2, and M2-her.
Q · 06 How accurate is the tokenizer estimate?
The browser widget uses a minimax-tokenizer-estimate chars-per-token approximation for English planning. Real billing depends on MiniMax server-side tokenization and can differ for Chinese, code, and mixed-language prompts.
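The cache read ($0.03/M) and cache write ($0.375/M) rates quoted in this FAQ imply a break-even point for reusing a shared prompt prefix. The sketch below is a rough illustration, not MiniMax's billing logic: it assumes the first request pays the write rate in place of the normal input rate, every later request pays the read rate, and the cache entry never expires between requests. The function name `prefix_cost` is hypothetical.

```python
# USD per 1M tokens, from the MiniMax M2 rates listed on this page.
INPUT = 0.30
CACHE_WRITE = 0.375
CACHE_READ = 0.03


def prefix_cost(requests: int, cached: bool, prefix_tokens: int = 1_000_000) -> float:
    """USD spent on a shared prompt prefix across `requests` calls.

    Assumption: with caching, the first call pays the write rate and every
    later call pays the read rate; without caching, each call pays full input.
    """
    if requests <= 0:
        return 0.0
    scale = prefix_tokens / 1_000_000
    if not cached:
        return requests * INPUT * scale
    return (CACHE_WRITE + (requests - 1) * CACHE_READ) * scale


for n in (1, 2, 5, 20):
    uncached = prefix_cost(n, cached=False)
    cached = prefix_cost(n, cached=True)
    print(f"{n:>2} requests: uncached ${uncached:.3f}, cached ${cached:.3f}")
```

Under these assumptions a prefix sent only once costs more with caching than without, and caching is cheaper from the second request onward.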