Analysis
How prompt caching actually changes the LLM cost math
Anthropic charges 10% of input for cache reads, with a 1.25x write fee. OpenAI auto-caches above 1024 tokens. The math changes which LLM is cheapest -- here is when.
2 posts tagged "cost-optimization".
Every post tagged "cost optimization" in the journal. Tag archives are auto-generated from post frontmatter -- one entry per unique tag across non-draft posts. Currently 2 posts share this tag. Use the category filter above to scope by editorial type.
Anthropic charges 10% of input for cache reads, with a 1.25x write fee. OpenAI auto-caches above 1024 tokens. The math changes which LLM is cheapest -- here is when.
Showing 1 of 2 posts — page 1 of 1
Caveman mode strips filler from Claude's output. The 65% headline is real for output tokens. Here's what it actually saves on your monthly bill.