RETIRED · 2M CONTEXT · ARCHIVE PRICE · REDIRECTS TO GROK-4-3

Grok 4 Fast API Pricing

Grok 4 Fast is now an archive entry: historical pricing was $0.20/M input, $0.05/M cached input, and $0.50/M output. xAI's May 15, 2026 retirement guide says retired Fast slugs redirect to Grok 4.3.

Input (per 1M tokens) · $0.20/M · Historical Fast tier archive
Output (per 1M tokens) · $0.50/M · Archive fast-tier price
Cached input · $0.05/M · Prompt cache discount
Effective (agentic blend) · $0.11/M · 92/8 split, 82% cache
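
The $0.11/M effective figure can be reproduced from the three list prices. The sketch below assumes that "92/8 split" means 92% input tokens and 8% output tokens, and that the 82% cache hit rate applies to input tokens only. The function name `effective_rate` is illustrative and not part of any xAI tooling.

```python
def effective_rate(input_price, cached_price, output_price,
                   input_share=0.92, cache_hit=0.82):
    """Blended USD per 1M tokens for a mixed input/output workload.

    Assumption of this sketch: cache hits apply to input tokens only.
    """
    # Average price of one million input tokens after cache hits
    blended_input = cache_hit * cached_price + (1 - cache_hit) * input_price
    # Weight input and output by their share of total tokens
    return input_share * blended_input + (1 - input_share) * output_price


# Grok 4 Fast archive rates, USD per 1M tokens
rate = effective_rate(0.20, 0.05, 0.50)
print(f"${rate:.2f}/M")  # $0.11/M
```

Under these assumptions the same function gives roughly $0.22/M when `cache_hit=0.0`, which is the figure to use for fresh-input workloads.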
§ 01 / TERMINAL

Run the numbers.

Calculator pre-loaded with Grok 4 Fast archive rates. After May 15, 2026, xAI routes retired fast slugs to Grok 4.3 pricing.

Inputs: monthly budget ($/mo), workload split, prompt cache hit rate.
Outputs: tokens you can process, words equivalent (English), effective rate.
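
The calculator's core relationship is budget divided by effective rate. The sketch below illustrates it with a hypothetical $100 monthly budget and the $0.11/M archive blend. The 0.75 words-per-token factor is a common rule of thumb for English and an assumption here, not a value taken from this page.

```python
def tokens_for_budget(budget_usd, rate_per_million):
    """Tokens a budget buys at a blended USD-per-1M-token rate."""
    return budget_usd / rate_per_million * 1_000_000


WORDS_PER_TOKEN = 0.75  # assumed rule of thumb for English prose

budget = 100.0                             # hypothetical monthly budget, USD
tokens = tokens_for_budget(budget, 0.11)   # archive effective rate from this page
words = tokens * WORDS_PER_TOKEN

print(f"{tokens / 1e6:,.0f}M tokens, about {words / 1e6:,.0f}M English words")
```

After the redirect, the same budget should be evaluated at Grok 4.3 rates instead, since that is what retired slugs are billed at.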
§ 02 / SCENARIOS

Real-world presets.

§ 03 / TAPE

Price history.

Grok 4 Fast retired on 2026-05-15; archive price remains $0.20/$0.50 per M.

Input · $0.20/M
Output · $0.50/M
Cached · $0.05/M
SEP 19 · Launched at $0.20/M input and $0.50/M output
MAY 18 · Retirement verified; slug redirects to Grok 4.3 after May 15, 2026
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · grok-tokenizer-estimate · ≈3.85 chars/token · Auto-counts as you type

This is a chars-per-token approximation, not a real tokenizer. Actual token counts vary by language, code density, and tool-call overhead; the estimate is typically off by 10–20% for English prose and by more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.

Reported fields: characters, words, tokens (estimated), cost as input (uncached), cost as output (uncached), cost as cached input.
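
The estimate can be approximated offline with the same chars-per-token idea. This is a minimal sketch: it assumes the ≈3.85 ratio shown in the widget header, and the rounding rule is a guess because the page does not say how the widget rounds. The helper names are illustrative.

```python
CHARS_PER_TOKEN = 3.85  # ratio shown in the widget header; an approximation, not a tokenizer

# Grok 4 Fast archive rates, USD per 1M tokens
PRICES = {"input": 0.20, "output": 0.50, "cached_input": 0.05}


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Rough token count from character length (rounding rule is an assumption)."""
    return round(len(text) / chars_per_token)


def cost_usd(tokens, price_per_million):
    """Cost of `tokens` at a USD-per-1M-token price."""
    return tokens / 1_000_000 * price_per_million


sample = "Paste text. See tokens. See cost."
n = estimate_tokens(sample)
for kind, price in PRICES.items():
    print(f"{kind:<13} {n} tokens -> ${cost_usd(n, price):.8f}")
```

For invoices, prefer the token counts returned in the API response over any character-based estimate.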
§ 05 / SHELF

Up against the shelf.

| Model | Input /M | Cached input /M | Output /M | Effective blended | Context | Best for |
|---|---|---|---|---|---|---|
| Grok 4 Fast (this page) | $0.20 | $0.05 | $0.50 | $0.11 (archive, 92/8) | 2M | Archive fast-tier invoices |
| Grok 4.3 | $1.25 | $0.20 | $2.50 | $0.56 (current replacement) | 1M | Current Grok default |
| Gemini 2.5 Flash | $0.30 | $0.03 | $2.50 | $0.27 (budget Gemini) | 1M | Low-cost multimodal work |
| GPT-5.4 mini | $0.75 | $0.07 | $4.50 | $0.54 (OpenAI mini) | 400K | Subagents and lightweight coding |
| DeepSeek V4 Pro | $0.43 | $0.00 | $0.87 | $0.14 (budget reasoning) | 1M | Low-cost reasoning alternative |
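
The effective column depends heavily on the cache assumption. The sketch below recomputes it for each row at 82% cache hits and again with caching off. It assumes the 92/8 input/output split applies to every row and that cache hits affect input tokens only. The `blended` helper is illustrative.

```python
# (input, cached input, output) list prices in USD per 1M tokens, from the shelf table
SHELF = {
    "Grok 4 Fast (archive)": (0.20, 0.05, 0.50),
    "Grok 4.3": (1.25, 0.20, 2.50),
    "Gemini 2.5 Flash": (0.30, 0.03, 2.50),
    "GPT-5.4 mini": (0.75, 0.07, 4.50),
    "DeepSeek V4 Pro": (0.43, 0.00, 0.87),
}


def blended(prices, cache_hit, input_share=0.92):
    """Blended USD per 1M tokens; cache hits assumed to apply to input only."""
    inp, cached, out = prices
    per_input = cache_hit * cached + (1 - cache_hit) * inp
    return input_share * per_input + (1 - input_share) * out


print(f"{'Model':<24} {'82% cache':>10} {'no cache':>10}")
for name, prices in SHELF.items():
    with_cache = blended(prices, cache_hit=0.82)
    no_cache = blended(prices, cache_hit=0.0)
    print(f"{name:<24} {with_cache:>10.2f} {no_cache:>10.2f}")
```

With these inputs the 82% column matches the table's effective figures to the cent, which supports the assumed reading of the 92/8 split.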

Frequently asked.

Grok 4 Fast pricing questions, with archive status separated from token math.

Q · 01 What was Grok 4 Fast's API price?
This archive page uses $0.20/M input, $0.05/M cached input, and $0.50/M output. Use it for historical billing and migration math, not for new production routing.
Q · 02 Is Grok 4 Fast still available?
xAI's May 15, 2026 retirement guide lists grok-4-fast-reasoning and grok-4-fast-non-reasoning as retired. After retirement, requests redirect to grok-4.3.
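
Because the redirect happens on xAI's side, client code that still names a retired slug will be served and billed as Grok 4.3 without any change. One option is to make the substitution explicit in your own configuration. The sketch below is a hypothetical client-side helper, not an xAI API; only the slug names come from the retirement guide as quoted above.

```python
# Retired slug -> replacement, per the retirement guide quoted above
RETIRED_SLUGS = {
    "grok-4-fast-reasoning": "grok-4.3",
    "grok-4-fast-non-reasoning": "grok-4.3",
}


def resolve_model(slug):
    """Return the model that will actually serve a request for `slug`."""
    return RETIRED_SLUGS.get(slug, slug)


def is_retired(slug):
    """True if the slug is on the retired list and should be migrated."""
    return slug in RETIRED_SLUGS


for slug in ("grok-4-fast-reasoning", "grok-4.3"):
    print(f"{slug} -> {resolve_model(slug)} (retired: {is_retired(slug)})")
```

Logging the resolved model alongside each request makes it easier to reconcile invoices that span the retirement date.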
Q · 03 What should replace it?
Use grok-4.3. xAI's current pricing page lists Grok 4.3 at $1.25/M input, $0.20/M cached input, and $2.50/M output.
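
For migration planning, it helps to price one month of real usage under both rate cards. The sketch below uses a hypothetical workload, expressed in millions of tokens, and the two price sets quoted on this page. The `Prices` class and `monthly_cost` helper are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """USD per 1M tokens."""
    fresh_input: float
    cached_input: float
    output: float


GROK_4_FAST = Prices(0.20, 0.05, 0.50)   # archive rates
GROK_4_3 = Prices(1.25, 0.20, 2.50)      # replacement rates


def monthly_cost(p, fresh_m, cached_m, output_m):
    """Cost in USD for usage given in millions of tokens."""
    return p.fresh_input * fresh_m + p.cached_input * cached_m + p.output * output_m


# Hypothetical month: 100M fresh input, 400M cached input, 50M output
usage = dict(fresh_m=100, cached_m=400, output_m=50)
old = monthly_cost(GROK_4_FAST, **usage)
new = monthly_cost(GROK_4_3, **usage)

print(f"archive: ${old:,.2f}  grok-4.3: ${new:,.2f}  ratio: {new / old:.1f}x")
```

For this example workload the bill rises from $65 to $330, about 5.1 times higher. Your own ratio depends on your mix, because the input, cached input, and output prices rise by different factors.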
Q · 04 Does prompt caching apply?
The archive row includes cached input at $0.05/M. The effective blend assumes 82% cache hits; turn caching off in the calculator for fresh-input workloads.
Q · 05 Why keep an archive page?
Retired model prices still matter for old invoices, migration plans, benchmarks, and generation-to-generation price history. The page is marked as archive so it does not imply current endpoint availability.
Q · 06 How accurate is the tokenizer estimate?
The widget uses about 3.85 characters per token, the ratio shown in its header, as a planning estimate. Exact billing can vary with language, hidden reasoning, cached prompt boundaries, and tool usage.