Last verified
MINI REASONING400K CONTEXTTEXT + VISIONPROMPT CACHINGCOMPUTER USE

GPT-5.4 mini API Pricing

OpenAI's smaller GPT-5.4 model for coding, computer use, and subagents: $0.75/M input, $4.50/M output, and $0.075/M cached input. It keeps a 400K context window and the same text-plus-image input shape as the larger GPT-5.4 family.

Input - per 1M tokens
$0.75/M
Standard model tier standard
Output - per 1M tokens
$4.50/M
Batch/Flex output is $2.25/M standard
Cached input - 90% off
$0.07/M
Prompt cache hit price -90%
Effective - agentic blend
$0.54/M
92/8 split - 82% cache
§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current GPT-5.4 mini rates. Tweak cache hit rate or workload split to compare subagent, support, and computer-use workloads.

$ /mo
Workload split
Prompt cache hit rate
Tokens you can process
Words equivalent (English)
Effective rate
§ 02 / SCENARIOS

Real-world presets.

§ 03 / TAPE

Price history.

API list price is still $0.75/M input and $4.50/M output since the first GPT-5.4 mini snapshot.

Input · $0.75/M
Output · $4.5/M
Cached · $0.07/M
MAR 17 First listed snapshot at $0.75/M input and $4.50/M outputMAY 18 Verified unchanged on OpenAI pricing docs
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · tiktoken-o200k_base · ≈3.85 chars/token Auto-counts as you type

This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.

Characters
Words
Tokens (estimated)
Cost as input · uncached
Cost as output · uncached
Cost as cached input
§ 05 / SHELF

Up against the shelf.

All models →
Model Input /M Output /M Effective blended Context Best for
GPT-5.4 mini Current $0.75 cache $0.07 $4.50 $0.54 agentic 92/8 400K Subagents and computer-use tasks
GPT-5.4 $2.50 cache $0.25 $15.00 $1.80 larger model 1.05M Affordable OpenAI frontier work
GPT-5.4 nano $0.20 cache $0.02 $1.25 $0.15 same blend 400K Classification and extraction
GPT-5.3-Codex $1.75 cache $0.17 $14.00 $1.54 coding specialist 400K Agentic coding tasks
Gemini 2.5 Pro $1.25 cache $0.13 $10.00 $1.10 tier 1 pricing 2M Large multimodal context
DeepSeek V4 Pro $0.43 cache $0.00 $0.87 $0.14 promo price 1M Low-cost reasoning workloads
DeepSeek V4 Flash $0.14 cache $0.00 $0.28 $0.05 same blend 1M Bulk low-cost traffic

Frequently asked.

Practical GPT-5.4 mini pricing questions, with OpenAI's published rates separated from workload assumptions.

Q · 01 What is the standard GPT-5.4 mini API price? +
OpenAI lists gpt-5.4-mini at $0.75/M input tokens, $0.075/M cached input tokens, and $4.50/M output tokens. This page stores the public list price in USD and marks the source as OpenAI's pricing docs.
Q · 02 How much cheaper is GPT-5.4 mini than GPT-5.4? +
GPT-5.4 mini is 3.3x cheaper than GPT-5.4 on the shared agentic blend: about $0.54/M versus $1.80/M. The tradeoff is model capability and context: mini has a 400K context window rather than GPT-5.4's larger 1.05M window.
Q · 03 How much cheaper are Batch and Flex? +
OpenAI lists Batch and Flex for gpt-5.4-mini at half the standard rate: $0.375/M input, $0.0375/M cached input, and $2.25/M output. That makes latency-tolerant subagent and support workloads materially cheaper.
Q · 04 What does Priority processing cost? +
The Priority table lists gpt-5.4-mini at $1.50/M input, $0.15/M cached input, and $9/M output. That is 2x the standard price, so use it only when latency matters.
Q · 05 Does GPT-5.4 mini have regional pricing? +
Yes. OpenAI states that regional processing/data-residency endpoints carry a 10% uplift for gpt-5.4-mini and related GPT-5.4/5.5 models. Standard global routing uses the public standard rate.
Q · 06 Does GPT-5.4 mini support computer use? +
Yes. OpenAI's model page lists computer use among the tools supported by gpt-5.4-mini in the Responses API. GPT-5.4 nano's model page does not list computer use support, which is one practical reason to pick mini for browser or desktop automation.
Q · 07 What release date is used here? +
OpenAI's model page lists the first GPT-5.4 mini snapshot as gpt-5.4-mini-2026-03-17. This page uses 2026-03-17 as the model release date for pricing-history purposes.
Q · 08 How accurate is the tokenizer estimate? +
The live widget uses an estimated 4.875 characters per token for English text and labels the tokenizer as tiktoken-o200k_base. It is good for budget planning, but exact billing can differ by language, whitespace, code, tool calls, and image inputs.