Last verified 2026-05-18

ARCHIVE PRICE · LEGACY MODEL · 128K · TEXT + VISION · PROMPT CACHING

GPT-4o mini API Pricing

GPT-4o mini is a legacy cheap-tier OpenAI model priced at $0.15/M input, $0.60/M output, and $0.075/M cached input. It remains a useful reference for old cost baselines and for migration math toward GPT-5.4 mini.

Archived input - per 1M tokens

$0.15/M

Source: OpenAI (legacy)

Archived output - per 1M tokens

$0.60/M

Use for invoice checks (legacy)

Cached input

$0.075/M

Cache-hit price (discounted)

Effective - agentic blend

$0.13/M

92/8 input/output split · 82% cache hit rate
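The $0.13/M blend can be reproduced from the three list prices. The sketch below assumes that "92/8 split" means 92% input and 8% output tokens, and that "82% cache" is the share of input tokens billed at the cached rate. The function name and parameters are illustrative and are not taken from the page's calculator.

```python
def effective_rate(input_price, cached_price, output_price,
                   input_share=0.92, cache_hit=0.82):
    """Blended USD price per 1M tokens for a mixed workload.

    Assumptions (not confirmed by the page): input_share is the fraction
    of all tokens that are input, and cache_hit is the fraction of those
    input tokens billed at the cached rate.
    """
    blended_input = cache_hit * cached_price + (1 - cache_hit) * input_price
    return input_share * blended_input + (1 - input_share) * output_price


# GPT-4o mini list prices from this page, USD per 1M tokens
rate = effective_rate(input_price=0.15, cached_price=0.075, output_price=0.60)
print(f"${rate:.2f}/M")  # prints $0.13/M
```

Under these assumptions the unrounded value is about $0.1294/M, which rounds to the figure shown on the card.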

§ 01 / TERMINAL

Run the numbers.

Archive calculator pre-loaded with GPT-4o mini rates. Tweak spend, output mix, or cache hit rate to compare legacy bills with current replacements.
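As a rough stand-in for the calculator, the sketch below converts a monthly budget into a token volume at a blended rate. The $100 spend, the helper name, and the 0.75 words-per-token ratio are assumptions chosen for illustration; only the $0.13/M rate comes from this page.

```python
def tokens_for_budget(monthly_spend_usd, effective_rate_per_m):
    """Tokens a monthly budget buys at a blended USD-per-1M-token rate."""
    return monthly_spend_usd / effective_rate_per_m * 1_000_000


WORDS_PER_TOKEN = 0.75  # assumed rule of thumb for English prose

tokens = tokens_for_budget(100, 0.13)  # hypothetical $100/mo at $0.13/M
words = tokens * WORDS_PER_TOKEN
print(f"{tokens:,.0f} tokens, about {words:,.0f} words")
```

Changing the spend or substituting a different blended rate gives the same kind of comparison the calculator offers between legacy and current models.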


§ 02 / SCENARIOS

Real-world presets.

CODING AGENT ARCHIVE

Repo-wide bug fix

$0.011/task

81k in - 7k out · ~9,090 units/$100

LONG DOC ANALYSIS

Reading 100-page contracts

$0.020/doc

175k in - 8k out · ~5,000 units/$100

RAG SUPPORT

Support agent ticket triage

$0.001/ticket

4k in - 1k out · ~100,000 units/$100

ASSISTANT TURN

Research planning turn

$0.002/turn

12k in - 2k out · ~50,000 units/$100
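The per-unit prices in these presets are consistent with an 82% cache hit rate on input tokens, although the page does not state which rate the presets use. The sketch below treats that rate as an assumption and recomputes each preset from its token counts; `task_cost` is a hypothetical helper name.

```python
INPUT, CACHED, OUTPUT = 0.15, 0.075, 0.60  # USD per 1M tokens, from this page


def task_cost(tokens_in, tokens_out, cache_hit=0.82):
    """USD cost of one task; cache_hit is the assumed cached share of input."""
    input_rate = cache_hit * CACHED + (1 - cache_hit) * INPUT
    return (tokens_in * input_rate + tokens_out * OUTPUT) / 1_000_000


presets = {
    "Repo-wide bug fix": (81_000, 7_000),
    "Reading 100-page contracts": (175_000, 8_000),
    "Support agent ticket triage": (4_000, 1_000),
    "Research planning turn": (12_000, 2_000),
}
for name, (t_in, t_out) in presets.items():
    print(f"{name}: ${task_cost(t_in, t_out):.3f} per unit")
```

The units/$100 figures appear to be derived from the rounded per-unit price, for example $100 / $0.011 ≈ 9,090.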

§ 03 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · tiktoken-cl100k_base · ≈4 chars/token

This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.

Characters 285

Words 42

Tokens (estimated) 71 tokens

Cost as input · uncached $0.000011 USD

Cost as output · uncached $0.000043 USD

Cost as cached input $0.000005 USD
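The sample figures above follow from the 4-characters-per-token heuristic and the list prices. A minimal sketch, assuming the estimate rounds to the nearest whole token (the page does not say how it rounds) and using a placeholder string in place of the original 285-character text:

```python
PRICES = {"input": 0.15, "output": 0.60, "cached input": 0.075}  # USD per 1M tokens


def estimate_tokens(text, chars_per_token=4):
    """Rough token count from character length; not a real tokenizer."""
    return round(len(text) / chars_per_token)


def estimate_costs(text):
    """Estimated USD cost of `text` under each price in PRICES."""
    tokens = estimate_tokens(text)
    return {kind: tokens * price / 1_000_000 for kind, price in PRICES.items()}


sample = "x" * 285  # stand-in for a 285-character passage
print(estimate_tokens(sample))  # prints 71
for kind, usd in estimate_costs(sample).items():
    print(f"{kind}: ${usd:.6f}")
```

For billing-grade counts, replace `estimate_tokens` with the vendor's tokenizer; the cost arithmetic stays the same.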

§ 04 / SHELF

Up against the shelf.


Model	Input /M	Cached input /M	Output /M	Effective blended /M	Context	Best for
GPT-4o mini (this page)	$0.15	$0.075	$0.60	$0.13	128K	Legacy cheap OpenAI workloads
GPT-5.5	$5.00	$0.50	$30.00	$3.60	1M	Current replacement workloads
GPT-5.4	$2.50	$0.25	$15.00	$1.80	1.05M	Current replacement workloads
GPT-5.4 mini	$0.75	$0.07	$4.50	$0.54	400K	Current replacement workloads
Claude Sonnet 4.6	$3.00	$0.30	$15.00	$1.92	1M	Legacy Claude production baselines
Gemini 2.5 Pro	$1.25	$0.13	$10.00	$1.10	2M	Gemini long-context work
DeepSeek V4 Pro	$0.43	$0.00	$0.87	$0.14	1M	Low-cost reasoning and coding
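For migration math, a legacy bill can be scaled by the ratio of effective blended rates. The sketch below assumes the token volume and workload mix stay the same after migration, which is rarely exact in practice. The $500 bill and the `projected_bill` helper are hypothetical; the rates are copied from the table.

```python
# Effective blended rates (USD per 1M tokens) as listed in the shelf table
SHELF = {
    "GPT-4o mini": 0.13,
    "GPT-5.4 mini": 0.54,
    "GPT-5.4": 1.80,
    "GPT-5.5": 3.60,
}


def projected_bill(legacy_bill_usd, target, baseline="GPT-4o mini"):
    """Scale a legacy monthly bill to a target model at equal token volume."""
    return legacy_bill_usd * SHELF[target] / SHELF[baseline]


for model in SHELF:
    print(f"{model}: ${projected_bill(500, model):,.2f}/mo")
```

Because the table rates are rounded to the cent, the projection is only as precise as those two significant figures.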


Frequently asked.

Short answers for teams checking GPT-4o mini pricing, status, and migration choices.

Q · 01 Is GPT-4o mini still available?

It is a legacy model. Check OpenAI's current pricing page before starting new production traffic.

Q · 02 How much did GPT-4o mini cost?

GPT-4o mini is listed here at $0.15/M input and $0.60/M output.

Q · 03 Is cached-input pricing included?

Yes. Cached input is shown at $0.075/M.

Q · 04 What should teams use instead?

For new work, compare against GPT-5.4 mini and the other current rows in the shelf.

Q · 05 Why keep this archive page?

Archive pages preserve old price baselines, help explain invoice history, and capture migration search demand without implying the model is the best current choice.

Reviewed by Yaroslav Vikhariev, Founder, AI//COST · Last verified 2026-05-18 · Archive pricing
