CHATGPT INSTANT · 400K CONTEXT · TEXT + VISION · PROMPT CACHING · MOVING ALIAS
chat-latest API Pricing
OpenAI's latest Instant model used in ChatGPT, exposed through the API as a moving alias. Current API pricing is $5/M input, $30/M output, and $0.5/M cached input. Pulled directly from developers.openai.com.
Input - per 1M tokens
$5.00/M
ChatGPT latest OpenAI row · standard
Output - per 1M tokens
$30.00/M
Output billed separately · standard
Cached input - 90% off
$0.50/M
Prompt cache hit price · -90%
Effective - agentic blend
$3.61/M
92/8 split - 82% cache
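The effective figure in the tile above can be reproduced with a short sketch. It assumes the 92/8 split is input versus output tokens and that the 82% cache hit rate applies to input tokens only; the page does not spell the formula out, but this reading matches the published $3.61/M. The function and parameter names are illustrative.

```python
def effective_rate(input_price, cached_price, output_price,
                   input_share=0.92, cache_hit_rate=0.82):
    """Blended USD price per 1M tokens.

    Assumption: input_share is the fraction of all tokens that are input,
    and cache_hit_rate is the fraction of input tokens billed at the
    cached price.
    """
    # Average price of an input token after cache hits are accounted for.
    blended_input = (cache_hit_rate * cached_price
                     + (1 - cache_hit_rate) * input_price)
    # Weight input and output prices by their share of total tokens.
    return input_share * blended_input + (1 - input_share) * output_price


# chat-latest list prices from this page: $5 input, $0.50 cached, $30 output.
rate = effective_rate(5.00, 0.50, 30.00)
print(f"${rate:.2f}/M")  # prints $3.61/M
```

Setting `cache_hit_rate=0.0` gives the blend with caching turned off.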
§ 01 / TERMINAL
Run the numbers.
Live calculator pre-loaded with current chat-latest rates. Tweak the workload split and cache hit rate, then share the URL to pass the calculation along.
Inputs: monthly budget ($/mo), workload split, prompt cache hit rate.
Outputs: tokens you can process, words equivalent (English), effective rate.
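As a rough sketch of what the calculator outputs mean, a monthly budget divided by an effective rate gives a token allowance. The budget below is hypothetical, and the words-per-token ratio is a common rule of thumb for English, not a figure from this page.

```python
# Assumption: roughly 0.75 English words per token (rule of thumb only).
WORDS_PER_TOKEN = 0.75


def tokens_for_budget(budget_usd, rate_per_million):
    """Whole tokens a budget buys at a given USD price per 1M tokens."""
    return int(budget_usd / rate_per_million * 1_000_000)


budget = 100.0  # hypothetical monthly budget in USD
rate = 3.61     # effective blended $/M for chat-latest, from this page

tokens = tokens_for_budget(budget, rate)
words = int(tokens * WORDS_PER_TOKEN)
print(f"{tokens:,} tokens, about {words:,} English words")
```

Changing the workload split or cache hit rate changes `rate`, so the allowance moves with it.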
§ 02 / SCENARIOS
Real-world presets.
ASSISTANT
Everyday ChatGPT-style assistant
$0.034/turn
SUPPORT
Support answer drafting
$0.060/ticket
VISION
Image-heavy question
$0.076/ask
WRITING
Creative brief iteration
$0.080/draft
RAG
Knowledge-base answer
$0.105/answer
§ 03 / TAPE
Price history.
Input · $5/M
Output · $30/M
Cached · $0.50/M
MAY 05 GPT-5.5 Instant began rolling out to ChatGPT and the API as chat-latest
MAY 18 Verified unchanged on OpenAI pricing and model docs
§ 04 / TOKENIZER
Paste text. See tokens. See cost.
Estimate · tiktoken-o200k_base · ≈3.85 chars/token
This is a chars-per-token approximation, not a real tokenizer. Actual token counts vary by language, code density, and tool-call overhead; estimates are typically off by 10–20% for English prose, and by more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.
Outputs: characters, words, tokens (estimated), cost as input (uncached), cost as output (uncached), cost as cached input.
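The estimate described above can be approximated in a few lines. This is a minimal sketch, not the widget's actual code: it assumes the ≈3.85 chars/token ratio from the widget label, rounds to the nearest token (the page does not say how the widget rounds), and uses the chat-latest list prices from this page.

```python
CHARS_PER_TOKEN = 3.85  # from the widget label; an approximation, not a tokenizer

# chat-latest list prices in USD per 1M tokens, as quoted on this page.
PRICES_PER_M = {"input": 5.00, "output": 30.00, "cached_input": 0.50}


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Estimate a token count from character length alone."""
    # Assumption: round to the nearest whole token.
    return round(len(text) / chars_per_token)


def estimate_costs(text):
    """Estimated USD cost of the text under each billing category."""
    tokens = estimate_tokens(text)
    return {kind: tokens * price / 1_000_000
            for kind, price in PRICES_PER_M.items()}


sample = "Paste text. See tokens. See cost."
print(len(sample), "characters,", len(sample.split()), "words,",
      estimate_tokens(sample), "estimated tokens")
print(estimate_costs(sample))
```

For billing-grade counts, swap `estimate_tokens` for a real tokenizer; the cost arithmetic stays the same.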
| Model | Input /M | Cached input /M | Output /M | Effective blended /M | Context | Best for |
|---|---|---|---|---|---|---|
| chat-latest (current) | $5.00 | $0.50 | $30.00 | $3.61 (agentic 92/8) | 400K | Latest ChatGPT Instant alias |
| GPT-5.5 | $5.00 | $0.50 | $30.00 | $3.61 (same input/output) | 1M | Production flagship API model |
| GPT-5.4 | $2.50 | $0.25 | $15.00 | $1.80 (cheaper frontier) | 1.05M | Affordable OpenAI frontier work |
| GPT-5.4 mini | $0.75 | $0.07 | $4.50 | $0.54 (smaller tier) | 400K | Subagents and computer-use tasks |
| GPT-5.4 nano | $0.20 | $0.02 | $1.25 | $0.15 (lowest GPT tier) | 400K | Classification and extraction |
| Claude Opus 4.7 | $5.00 | $0.50 | $25.00 | $3.21 (Anthropic frontier) | 1M | Frontier reasoning and writing |
| Gemini 3.1 Pro Preview | $2.00 | $0.20 | $12.00 | $1.44 (Google preview) | 1M | Large multimodal reasoning |
Frequently asked.
Practical chat-latest pricing questions, with OpenAI's published rates separated from workload assumptions.
Q · 01 What is the standard chat-latest API price?
OpenAI lists chat-latest at $5/M input tokens, $0.5/M cached input tokens, and $30/M output tokens. This page stores the public list price in USD and marks OpenAI's pricing docs as the source.
Q · 02 How is the effective blended cost calculated?
The effective tile uses the fixed AI//COST 92/8 agentic blend with 82% cache hits. For chat-latest, that works out to $3.61/M using $5/M input, $0.5/M cached input, and $30/M output.
Q · 03 Is Batch API pricing listed?
OpenAI's model docs expose Batch as an endpoint, and the pricing table separately lists discounted Batch rows where available. If a model-specific Batch row is absent, this page does not invent one.
Q · 04 Does prompt caching change the input price?
Yes. A cache hit is billed at $0.5/M instead of $5/M. The calculator lets you set the hit rate to 82%, 50%, or off.
Q · 05 Are regional processing prices different?
OpenAI states that regional processing/data-residency endpoints can carry a 10% uplift for selected GPT-5.4/5.5 family models. This page uses the global public API list price unless the vendor row explicitly lists a regional surcharge.
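To make the caching and regional answers above concrete, here is a hedged sketch of a per-request cost. The token counts are hypothetical. Applying the 10% uplift uniformly to every token type is an assumption, and the page does not say whether the uplift applies to chat-latest at all.

```python
# chat-latest list prices in USD per 1M tokens, as quoted on this page.
INPUT_PER_M = 5.00
CACHED_INPUT_PER_M = 0.50
OUTPUT_PER_M = 30.00

# Assumption: the regional uplift scales the whole bill by 10%.
REGIONAL_UPLIFT = 0.10


def request_cost(uncached_input, cached_input, output, regional=False):
    """USD cost of one request given its token counts."""
    cost = (uncached_input * INPUT_PER_M
            + cached_input * CACHED_INPUT_PER_M
            + output * OUTPUT_PER_M) / 1_000_000
    return cost * (1 + REGIONAL_UPLIFT) if regional else cost


# Hypothetical request: 10,000 input tokens, 8,000 of them cache hits,
# and 500 output tokens.
with_cache = request_cost(uncached_input=2_000, cached_input=8_000, output=500)
no_cache = request_cost(uncached_input=10_000, cached_input=0, output=500)
print(f"with cache: ${with_cache:.4f}, without: ${no_cache:.4f}")
```

Passing `regional=True` shows what the same request would cost if the 10% uplift applied.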
Q · 06 Is chat-latest a pinned production model?
No. OpenAI says chat-latest points to the latest Instant model used in ChatGPT and that the underlying snapshot will be regularly updated. OpenAI recommends GPT-5.5 for production API usage when you need a stable model choice.
Q · 07 What release date is used here?
OpenAI's GPT-5.5 Instant post is dated May 5, 2026 and says the model is available in the API as chat-latest. This page uses 2026-05-05 for the first observed pricing-history point.
Q · 08 How accurate is the tokenizer estimate?
The live widget uses an estimated 4.875 characters per token for English text and labels the tokenizer as tiktoken-o200k_base. Exact billing can differ by language, code, whitespace, tool calls, and image inputs.