Question 1

Is Claude Opus 4.5 still the most expensive frontier model?

Accepted Answer

No. After Anthropic's repricing, Opus 4.5 sits at $5/M input and $25/M output — the same list rate as Opus 4.6 and Opus 4.7. Older Opus tiers (4.1, legacy 4) remain at the previous $15/$75 level. With an 82% cache hit rate and a 92/8 input-output workload, Opus 4.5's effective blended cost is $3.21/M.

Question 2

Why is Opus 4.5 cheaper now than at launch?

Accepted Answer

Anthropic published the current pricing table that lists Opus 4.5 alongside Opus 4.6 and 4.7 all at $5/M input and $25/M output. The earlier launch rate ($15/M / $75/M) is still the active price for Opus 4.1 and legacy Opus 4. Anthropic does not publicly document the date the Opus 4.5 cut took effect; we re-verify the live page when updating this entry.

Question 3

How does prompt caching work?

Accepted Answer

Mark sections of your prompt (system message, examples, retrieved context) as cacheable. Anthropic stores them server-side for either 5 minutes or 1 hour. Subsequent requests that include the cached section get billed at $0.50/M instead of $5/M — a 90% discount. Writing to cache costs 1.25× base for the 5-minute TTL ($6.25/M) and 2× for the 1-hour TTL ($10/M). For long-running agentic workloads with stable system prompts, hit rates of 80–90% are typical.

Question 4

Is there a Batch API discount?

Accepted Answer

Yes. Anthropic's Batch API gives a flat 50% discount on both input and output tokens. For Opus 4.5, the batch table lists $2.50/M input and $12.50/M output, versus the standard $5/M and $25/M rates.

Question 5

Are there geo-routing or regional surcharges?

Accepted Answer

The default first-party Claude API uses global routing at standard pricing. For Opus 4.6, Sonnet 4.6, and later models, setting inference_geo: "us" applies a 1.1× multiplier to every token category. The Opus 4.5 endpoint also responds to inference_geo; older models (Opus 4.1, legacy Opus 4) do not support the parameter at all and return a 400 error if you pass it. Bedrock and Vertex AI publish their own regional pricing — typically a 10% premium for regional vs global endpoints on 4.5-generation models and later.

Question 6

How accurate is your tokenizer?

Accepted Answer

We use Anthropic's official claude-tiktoken package, the same tokenizer the API uses for billing. Counts match production billing within ±1%. Languages with non-Latin scripts (Chinese, Japanese, Russian, Arabic) tokenize at higher ratios than English — typically 1.5–2× as many tokens per word.

Question 7

Are volume discounts published?

Accepted Answer

No fixed volume-discount ladder is published. Anthropic says volume discounts may be available for high-volume users and are negotiated case by case. Standard API users should assume the public rates shown here unless they have a separate enterprise agreement.

Model	Input /M	Output /M	Effective blended	Context	Best for
Claude Opus 4.5 Current	$5.00 cache $0.50	$25.00	$3.21 agentic 92/8	200K	Frontier coding · long-form writing
Claude Opus 4.7	$5.00 cache $0.50	$25.00	$3.21 same list price	1M	Newest Opus · 1M context
Claude Opus 4.6	$5.00 cache $0.50	$25.00	$3.21 same list price	1M	Opus generation peer
Claude Sonnet 4.6	$3.00 cache $0.30	$15.00	$1.92 cheaper	1M	Daily-driver agents · coding
Claude Sonnet 4.5	$3.00 cache $0.30	$15.00	$1.92 same list as 4.6	200K	Stable Sonnet migrations
Claude Haiku 4.5	$1.00 cache $0.10	$5.00	$0.64 cheaper	200K	Support · classification
Claude Opus 4.1	$15.00 cache $1.50	$75.00	$9.62 previous Opus tier	200K	Legacy Opus pricing — being phased out

Claude Opus 4.5 API Pricing

Run the numbers.

Real-world presets.

Personal AI assistant

Solo dev pair-programming

Reading 100-page contracts

Customer support RAG

Paste text. See tokens. See cost.

Up against the shelf.

Frequently asked.