Last verified 2026-05-18

RETIRED256K CONTEXTARCHIVE PRICEREDIRECTS TO GROK-4-3

Grok 4.1 Fast API Pricing

Q: What was Grok 4.1 Fast's API price?

This archive page uses $0.2/M input, $0.05/M cached input, and $0.5/M output. Use it for historical billing and migration math, not fresh production routing.

Q: Is Grok 4.1 Fast still available?

xAI's May 15, 2026 retirement guide lists grok-4-1-fast-reasoning and grok-4-1-fast-non-reasoning as retired. After retirement, requests redirect to grok-4.3.

Q: What should replace it?

Use grok-4.3. xAI's current pricing page lists Grok 4.3 at $1.25/M input, $0.20/M cached input, and $2.50/M output.

Q: Does prompt caching apply?

The archive row includes cached input at $0.05/M. The effective blend assumes 82% cache hits; set cache to off in the calculator for fresh-input workloads.

Q: How accurate is the tokenizer estimate?

The widget uses 4.875 characters per token as a planning estimate. Exact billing can vary with language, hidden reasoning, cached prompt boundaries, and tool usage.

Grok 4.1 Fast is now an archive row: historical pricing was $0.20/M input, $0.05/M cached input, and $0.50/M output. xAI's May 15 retirement guide says 4.1 Fast slugs redirect to Grok 4.3.

Input - per 1M tokens

$0.20/M

Historical 4.1 Fast archive

Output - per 1M tokens

$0.50/M

Archive fast price archive

Cached input

$0.05/M

Prompt cache discount

Effective - agentic blend

$0.11/M

92/8 split - 82% cache

§ 01 / TERMINAL

Run the numbers.

Calculator pre-loaded with Grok 4.1 Fast archive rates. After May 15, 2026, xAI routes retired 4.1 Fast slugs to Grok 4.3 pricing.

Spend

$ /mo

Workload split

Prompt cache hit rate

Tokens you can process

—

Words equivalent (English)

—

Effective rate

—

Open full calculator (all models · share URL · CSV) →

§ 02 / SCENARIOS

Real-world presets.

Invoice replay

$0.077/1M in

1000k in / 0k out~1,298 runs/$100

AGENT

Legacy agent task

$0.010/task

80k in / 8k out~9,803 tasks/$100

MIGRATION

Migration cost comparison

$0.020/pack

200k in / 10k out~4,901 packs/$100

RAG

Archived RAG answer

$0.018/answer

150k in / 12k out~5,681 answers/$100

§ 03 / TOKENIZER

Paste text. See tokens. See cost.

Your text · live count

Estimate · grok-tokenizer-estimate · ≈3.85 chars/token Auto-counts as you type

This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.

Characters 254

Words 43

Tokens (estimated) 66 tokens

Cost as input · uncached $0.000013 USD

Cost as output · uncached $0.000033 USD

Cost as cached input $0.000003 USD

§ 04 / SHELF

Up against the shelf.

All models →

Model	Input /M	Output /M	Effective blended	Context	Best for
Grok 4.1 Fast Current	$0.20 cache $0.05	$0.50	$0.11 archive 92/8	256K	Archive 4.1 Fast invoices
Grok 4.3	$1.25 cache $0.20	$2.50	$0.56 current replacement	1M	Current Grok default
Gemini 2.5 Flash	$0.30 cache $0.03	$2.50	$0.27 budget Gemini	1M	Low-cost multimodal work
GPT-5.4 mini	$0.75 cache $0.07	$4.50	$0.54 OpenAI mini	400K	Subagents and lightweight coding
DeepSeek V4 Pro	$0.43 cache $0.00	$0.87	$0.14 budget reasoning	1M	Low-cost reasoning alternative

Frequently asked.

Grok 4.1 Fast pricing questions, with archive status separated from token math.

Q · 01 What was Grok 4.1 Fast's API price? +

This archive page uses $0.2/M input, $0.05/M cached input, and $0.5/M output. Use it for historical billing and migration math, not fresh production routing.

Q · 02 Is Grok 4.1 Fast still available? +

xAI's May 15, 2026 retirement guide lists grok-4-1-fast-reasoning and grok-4-1-fast-non-reasoning as retired. After retirement, requests redirect to grok-4.3.

Q · 03 What should replace it? +

Use grok-4.3. xAI's current pricing page lists Grok 4.3 at $1.25/M input, $0.20/M cached input, and $2.50/M output.

Q · 04 Does prompt caching apply? +

The archive row includes cached input at $0.05/M. The effective blend assumes 82% cache hits; set cache to off in the calculator for fresh-input workloads.

Q · 05 Why keep an archive page? +

Retired model prices still matter for old invoices, migration plans, benchmarks, and generation-to-generation price history. The page is marked as archive so it does not imply current endpoint availability.

Q · 06 How accurate is the tokenizer estimate? +

The widget uses 4.875 characters per token as a planning estimate. Exact billing can vary with language, hidden reasoning, cached prompt boundaries, and tool usage.

Reviewed by Yaroslav Vikhariev Founder - AI//COST - Pricing pulled daily from docs.x.ai - Last verified May 18, 2026

Methodology Report a correction More by Y.V.