LOWER LATENCY · 205K CONTEXT · TEXT ONLY · PROMPT CACHING · 100 TPS TARGET

MiniMax M2.5 Highspeed API Pricing

MiniMax M2.5 Highspeed is MiniMax's lower-latency M2.5 value tier. The official pricing page lists $0.60/M input, $2.40/M output, and $0.03/M cached input. Rates are pulled directly from platform.minimax.io daily.

Input - per 1M tokens
$0.60/M
Direct USD vendor row · 2x base tier
Output - per 1M tokens
$2.40/M
Direct USD vendor row · 2x base tier
Cached input - prompt cache read
$0.03/M
95% off uncached input · cache write $0.375/M
Effective - agentic blend
$0.31/M
92/8 input/output split · 82% cache hit rate
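
The page does not spell out how the agentic blend is computed. The sketch below is one plausible reading, assuming that 92% of tokens are input, 8% are output, and 82% of input tokens are served from the prompt cache. The function name and parameters are illustrative, not part of any MiniMax API. Under those assumptions the listed rates reproduce the $0.31/M figure.

```python
def effective_rate(input_price, output_price, cached_price,
                   input_share=0.92, cache_hit=0.82):
    """Blended USD cost per 1M tokens (assumed formula, not vendor-published)."""
    # Average price of an input token once cache hits are accounted for.
    blended_input = cache_hit * cached_price + (1 - cache_hit) * input_price
    # Weight input and output by their share of total tokens.
    return input_share * blended_input + (1 - input_share) * output_price


rate = effective_rate(0.60, 2.40, 0.03)
print(f"${rate:.2f}/M")  # → $0.31/M
```

The same formula with the base M2.5 rates ($0.30, $1.20, $0.03) gives roughly $0.17/M, which matches the shelf comparison on this page.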
§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current MiniMax M2.5 Highspeed rates. Adjust the workload split and cache hit rate, then share the URL to pass the calculation along.

Inputs: monthly budget ($/mo), workload split, prompt cache hit rate.
Outputs: tokens you can process, words equivalent (English), effective rate.
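
As a rough sketch of what the calculator does with a budget, the snippet below converts a monthly spend into a token count at a given effective rate. The $100 budget is a hypothetical example, and the 0.75 words-per-token ratio is a common rule of thumb for English, not a figure from MiniMax or this page.

```python
WORDS_PER_TOKEN = 0.75  # assumption: rule-of-thumb ratio for English text


def tokens_for_budget(budget_usd, rate_per_million):
    """Number of tokens a budget buys at a blended USD rate per 1M tokens."""
    return int(budget_usd / rate_per_million * 1_000_000)


def words_equivalent(tokens, words_per_token=WORDS_PER_TOKEN):
    """Approximate English word count for a token total."""
    return int(tokens * words_per_token)


budget = 100.0           # hypothetical monthly budget in USD
effective_rate = 0.31    # agentic blend listed on this page, USD per 1M tokens

tokens = tokens_for_budget(budget, effective_rate)
print(f"{tokens:,} tokens, about {words_equivalent(tokens):,} words")
```

Changing the workload split or cache hit rate changes the effective rate, so the token total moves with it.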
§ 02 / SCENARIOS

Real-world presets.

§ 03 / TAPE

Price history.

MiniMax M2.5 Highspeed still lists $0.6/M input and $2.4/M output on MiniMax's current pay-as-you-go page.

Input · $0.60/M
Output · $2.40/M
Cached · $0.03/M
FEB 12 · M2.5 Highspeed launched as the faster M2.5 variant
MAY 23 · Verified unchanged on MiniMax pay-as-you-go pricing docs
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · minimax-tokenizer-estimate · ≈3.85 chars/token

This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.

The widget reports characters, words, estimated tokens, and the cost of the text as uncached input, as output, and as cached input.
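
A minimal sketch of the same estimate, assuming the widget simply divides the character count by 3.85 and rounds to the nearest token (the page does not say which rounding it uses). The prices are the list rates quoted on this page, and the function names are illustrative.

```python
CHARS_PER_TOKEN = 3.85  # approximation quoted on this page, not a real tokenizer

# USD per 1M tokens, from the list rates above.
PRICES = {"input": 0.60, "output": 2.40, "cached_input": 0.03}


def estimate_tokens(text):
    """Approximate token count from character length."""
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_costs(text):
    """Estimated USD cost of the text under each billing category."""
    tokens = estimate_tokens(text)
    return {kind: tokens * price / 1_000_000 for kind, price in PRICES.items()}


sample = "a" * 3850  # 3,850 characters, about 1,000 estimated tokens
print(estimate_tokens(sample))
print(estimate_costs(sample))
```

Because the ratio is tuned for English prose, expect larger errors for code or non-Latin scripts.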
§ 05 / SHELF

Up against the shelf.

| Model | Input /M | Output /M | Effective blended | Context | Best for |
| --- | --- | --- | --- | --- | --- |
| MiniMax M2.5 Highspeed (current) | $0.60 (cache $0.03) | $2.40 | $0.31 (agentic 92/8) | 205K | Fast M2.5 coding and agents |
| MiniMax M2.5 | $0.30 (cache $0.03) | $1.20 | $0.17 (cheaper MiniMax sibling) | 205K | Value-first MiniMax agent loops |
| MiniMax M2.1 Highspeed | $0.60 (cache $0.03) | $2.40 | $0.31 (same effective tier) | 205K | Lower-latency M2.1 agents |
| MiniMax M2.7 Highspeed | $0.60 (cache $0.06) | $2.40 | $0.34 (pricier MiniMax sibling) | 205K | Faster MiniMax flagship loops |
| MiniMax M2.7 | $0.30 (cache $0.06) | $1.20 | $0.19 (cheaper MiniMax sibling) | 205K | MiniMax flagship agent loops |
| MiniMax M2 | $0.30 (cache $0.03) | $1.20 | $0.17 (cheaper MiniMax sibling) | 205K | Original M2 coding agents |
| GPT-5.4 mini | $0.75 (cache $0.07) | $4.50 | $0.54 (verified shelf sibling) | 400K | OpenAI subagent workloads |
| DeepSeek V4 Pro | $0.43 (cache $0.00) | $0.87 | $0.14 (verified shelf sibling) | 1M | Low-cost reasoning workloads |

Frequently asked.

Practical MiniMax M2.5 Highspeed pricing questions, with live MiniMax list rates separated from workload assumptions.

Q · 01 What is the standard MiniMax M2.5 Highspeed API price?
MiniMax's official pay-as-you-go pricing page lists MiniMax-M2.5 Highspeed at $0.6/M input and $2.4/M output. Cache reads are listed at $0.03/M. AI//COST stores those direct USD list prices without currency conversion.
Q · 02 Does MiniMax publish prompt caching for this model?
MiniMax lists prompt-cache reads at $0.03/M and cache writes at $0.375/M for MiniMax M2.5 Highspeed. The quote board uses the read price because repeated cache hits drive recurring workload cost.
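
To see why the read price dominates, here is a rough sketch of the cost of a shared prompt prefix over repeated calls. It assumes the first call is billed at the cache-write rate and every later call at the cache-read rate, with no cache expiry. That billing model is an assumption for illustration, not something this page or MiniMax confirms, and the 50,000-token prefix and 20 calls are hypothetical.

```python
# USD per 1M tokens, from the list rates on this page.
INPUT, CACHE_WRITE, CACHE_READ = 0.60, 0.375, 0.03


def prefix_cost(prefix_tokens, calls, cached=True):
    """Total USD cost of sending the same prefix `calls` times.

    Assumed model: one cache write, then cache reads on every later call.
    """
    if calls <= 0:
        return 0.0
    millions = prefix_tokens / 1_000_000
    if not cached:
        return calls * millions * INPUT
    return millions * (CACHE_WRITE + (calls - 1) * CACHE_READ)


prefix, calls = 50_000, 20  # hypothetical workload
print("uncached:", prefix_cost(prefix, calls, cached=False))
print("cached:  ", prefix_cost(prefix, calls))
```

Under this model the single write is a small share of the cached total once the prefix is reused more than a handful of times.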
Q · 03 What context window does MiniMax M2.5 Highspeed support?
MiniMax's text-generation docs list 204,800 tokens of context for this M2-family text model. This page rounds that to 205K for display consistency.
Q · 04 Is there a Batch API discount?
MiniMax's current pay-as-you-go table does not publish a separate Batch API discount for MiniMax M2.5 Highspeed. Treat the public $0.6/M input and $2.4/M output rates as the default unless your account has a negotiated enterprise agreement.
Q · 05 When was this price last checked?
This page was verified against MiniMax's official pay-as-you-go pricing page on May 23, 2026. The same page currently lists the active text rows for M2.7, M2.5, M2.1, M2, and M2-her.
Q · 06 How accurate is the tokenizer estimate?
The browser widget uses the minimax-tokenizer-estimate approximation of about 3.85 characters per token, intended for planning with English text. Real billing depends on MiniMax's server-side tokenization and can differ for Chinese, code, and mixed-language prompts.