Last verified 2026-07-11

LOWER LATENCY · 205K CONTEXT · TEXT ONLY · PROMPT CACHING · 100 TPS TARGET

MiniMax M2.5 Highspeed API Pricing

MiniMax M2.5 Highspeed is MiniMax's lower-latency M2.5 value tier. The official pricing page lists $0.6/M input, $2.4/M output, and $0.03/M cached input. Prices are pulled daily from platform.minimax.io.

Input - per 1M tokens

$0.60/M

Direct USD vendor row · 2x base tier

Output - per 1M tokens

$2.40/M

Direct USD vendor row · 2x base tier

Cached input - prompt cache read

$0.03/M

Cache write $0.375/M · cache read is 95% off the input rate

Effective - agentic blend

$0.31/M

92/8 split - 82% cache
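The blended figure can be reproduced with a short sketch. It assumes that "92/8" means 92% input tokens and 8% output tokens, that 82% of input tokens are billed at the cache-read price, and that cache writes are ignored. Those readings are assumptions rather than a published formula, and the function name is hypothetical, but under them the arithmetic lands on the $0.31/M shown above.

```python
def effective_rate(input_price, output_price, cache_read_price,
                   input_share=0.92, cache_hit_rate=0.82):
    """Blended USD per 1M tokens for a workload mix (assumed formula).

    All prices are USD per 1M tokens. Cache-write charges are not modelled.
    """
    # Input tokens cost the cache-read price on a hit and list price on a miss.
    blended_input = (cache_hit_rate * cache_read_price
                     + (1 - cache_hit_rate) * input_price)
    # Weight the blended input price and the output price by token share.
    return input_share * blended_input + (1 - input_share) * output_price


rate = effective_rate(input_price=0.60, output_price=2.40, cache_read_price=0.03)
print(f"${rate:.2f}/M")  # $0.31/M
```

Lowering `cache_hit_rate` to 0 gives the uncached blend, which is a useful upper bound when the real hit rate is unknown.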

§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current MiniMax M2.5 Highspeed rates. Adjust the workload split and cache hit rate, then share the URL to pass the calculation along.
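For offline planning, the core of such a calculation is a single division: monthly spend divided by an effective per-million rate. The sketch below is an assumption about the arithmetic, not the calculator's actual code. The function name is hypothetical and the default rate is the rounded $0.31/M agentic blend shown on this page.

```python
def tokens_for_budget(spend_usd, rate_per_million=0.31):
    """Approximate number of tokens a budget buys at a blended USD/1M rate."""
    if rate_per_million <= 0:
        raise ValueError("rate_per_million must be positive")
    return spend_usd / rate_per_million * 1_000_000


tokens = tokens_for_budget(100)
print(f"{tokens / 1e6:.1f}M tokens")  # 322.6M tokens
```

Because the default rate is rounded to two decimals, treat the result as an order-of-magnitude planning number rather than a billing forecast.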

§ 02 / SCENARIOS

Real-world presets.

CODING AGENT

Repo repair task

$0.041/task

60k in / 2k out · ~2,450 units/$100

OFFICE

Analyst memo draft

$0.031/memo

40k in / 3k out · ~3,205 units/$100

Web research brief

$0.022/brief

25k in / 3k out · ~4,504 units/$100

CHATBOT

Product support turn

$0.006/turn

6k in / 1k out · ~16,666 units/$100
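The preset figures are consistent with uncached list prices and a rounded-down units count. The sketch below reproduces them under that assumption. The helper names are hypothetical, and no cache discount is applied, so real agent loops with warm caches should come in cheaper.

```python
import math

INPUT_PER_M = 0.60   # USD per 1M input tokens (list price)
OUTPUT_PER_M = 2.40  # USD per 1M output tokens (list price)


def task_cost(input_tokens, output_tokens):
    """Uncached USD cost of one task at list prices."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000


def units_per_budget(cost_per_unit, budget_usd=100.0):
    """Whole units that fit in the budget, rounded down."""
    return math.floor(budget_usd / cost_per_unit)


presets = {
    "Repo repair task": (60_000, 2_000),
    "Analyst memo draft": (40_000, 3_000),
    "Web research brief": (25_000, 3_000),
    "Product support turn": (6_000, 1_000),
}
for name, (tokens_in, tokens_out) in presets.items():
    cost = task_cost(tokens_in, tokens_out)
    print(f"{name}: ${cost:.3f} per unit, ~{units_per_budget(cost):,} units/$100")
```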

§ 03 / TOKENIZER

Paste text. See tokens. See cost.

Your text · live count

Calibrated · measured on the vendor's tokenizer · 2026-06-10 · Auto-counts as you type

Counts use a chars-per-token calibration measured on the vendor's own published tokenizer (MiniMaxAI/MiniMax-M2.1, 2026-06-10). English prose is typically within a few percent; code and non-Latin scripts tokenize heavier. For billing-exact counts use the vendor's count-tokens API.

Characters 437

Words 71

Tokens (estimated) 82 tokens

Cost as input · uncached $0.000049 USD

Cost as output · uncached $0.000197 USD

Cost as cached input $0.000002 USD
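A chars-per-token estimate of this kind can be sketched in a few lines. The ratio below (437 characters / 82 tokens, about 5.33) is back-derived from the sample panel above and is an assumption, not the widget's actual calibration constant. The function names are hypothetical. Prices are the list rates quoted on this page.

```python
CHARS_PER_TOKEN = 437 / 82  # assumed ratio, back-derived from the sample panel

PRICES_PER_M = {            # USD per 1M tokens, list rates from this page
    "input": 0.60,
    "output": 2.40,
    "cached_input": 0.03,
}


def estimate_tokens(text):
    """Rough token count for English prose from character length."""
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_cost(tokens, kind):
    """USD cost of `tokens` billed at the given price kind."""
    return tokens * PRICES_PER_M[kind] / 1_000_000


sample = "x" * 437  # stand-in for a 437-character English passage
tokens = estimate_tokens(sample)
print(tokens)  # 82
for kind in PRICES_PER_M:
    print(f"{kind}: ${estimate_cost(tokens, kind):.6f}")
```

Code and non-Latin scripts need a smaller chars-per-token ratio, so a single constant will undercount tokens for those inputs.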

§ 04 / SHELF

Up against the shelf.

Model	Input /M	Cache read /M	Output /M	Effective blended /M	Context	Best for
MiniMax M2.5 Highspeed (current)	$0.60	$0.03	$2.40	$0.31 (agentic 92/8)	205K	Fast M2.5 coding and agents
MiniMax M2.5	$0.30	$0.03	$1.20	$0.17 (cheaper MiniMax sibling)	205K	Value-first MiniMax agent loops
MiniMax M2.1 Highspeed	$0.60	$0.03	$2.40	$0.31 (same effective tier)	205K	Lower-latency M2.1 agents
MiniMax M2.7 Highspeed	$0.60	$0.06	$2.40	$0.34 (pricier MiniMax sibling)	205K	Faster MiniMax flagship loops
MiniMax M2.7	$0.30	$0.06	$1.20	$0.19 (cheaper MiniMax sibling)	205K	MiniMax flagship agent loops
MiniMax M2	$0.30	$0.03	$1.20	$0.17 (cheaper MiniMax sibling)	205K	Original M2 coding agents
GPT-5.4 mini	$0.75	$0.07	$4.50	$0.54 (verified shelf sibling)	400K	OpenAI subagent workloads
DeepSeek V4 Pro	$0.43	$0.00	$0.87	$0.14 (verified shelf sibling)	1M	Low-cost reasoning workloads

Frequently asked.

Practical MiniMax M2.5 Highspeed pricing questions, with live MiniMax list rates separated from workload assumptions.

Q · 01 What is the standard MiniMax M2.5 Highspeed API price?

MiniMax's official pay-as-you-go pricing page lists MiniMax-M2.5 Highspeed at $0.6/M input and $2.4/M output. Cache reads are listed at $0.03/M. AI//COST stores those direct USD list prices without currency conversion.

Q · 02 Does MiniMax publish prompt caching for this model?

MiniMax lists prompt-cache reads at $0.03/M and cache writes at $0.375/M for MiniMax M2.5 Highspeed. The quote board uses the read price because repeated cache hits drive recurring workload cost.

Q · 03 What context window does MiniMax M2.5 Highspeed support?

MiniMax's text-generation docs list 204,800 tokens of context for this M2-family text model. This page rounds that to 205K for display consistency.

Q · 04 Is there a Batch API discount?

MiniMax's current pay-as-you-go table does not publish a separate Batch API discount for MiniMax M2.5 Highspeed. Treat the public $0.6/M input and $2.4/M output rates as the default unless your account has a negotiated enterprise agreement.

Q · 05 When was this price last checked?

This page was verified against MiniMax's official pay-as-you-go pricing page on July 11, 2026. The same page currently lists the active text rows for M2.7, M2.5, M2.1, M2, and M2-her.

Q · 06 How accurate is the tokenizer estimate?

The browser widget estimates tokens with a chars-per-token approximation (minimax-tokenizer-estimate) intended for English planning. Real billing depends on MiniMax's server-side tokenization and can differ for Chinese, code, and mixed-language prompts.

Reviewed by Yaroslav Vikhariev, Founder, AI//COST - Pricing pulled daily from platform.minimax.io - Last verified July 11, 2026