GEMINI 2.5 FLASH · 1M CONTEXT · MULTIMODAL · CONTEXT CACHING · BATCH + FLEX -50%
Gemini 2.5 Flash API Pricing
Google's stable price-performance Gemini 2.5 model for low-latency, high-volume reasoning tasks: $0.30/M input, $2.50/M output, and $0.03/M cached input. Rates are taken directly from ai.google.dev.
Input - per 1M tokens
$0.30/M
Text/image/video standard
Output - per 1M tokens
$2.50/M
Includes thinking tokens standard
Cached input - 90% off
$0.03/M
Cache plus storage fee -90%
Effective - agentic blend
$0.27/M
92/8 split - 82% cache
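The $0.27/M effective figure can be reproduced from the published rates. This is a minimal sketch, assuming the 92/8 split means 92% input and 8% output tokens, that the 82% cache rate applies to input tokens only, and that cache storage fees are ignored. The function name `effective_rate` is illustrative, not part of any Google API.

```python
# Standard gemini-2.5-flash rates from this page, in USD per 1M tokens.
INPUT_RATE = 0.30
CACHED_INPUT_RATE = 0.03
OUTPUT_RATE = 2.50


def effective_rate(input_share: float, cache_hit_rate: float) -> float:
    """Blended USD cost per 1M tokens for a given workload mix.

    input_share: fraction of all tokens that are input (0.92 for a 92/8 split).
    cache_hit_rate: fraction of input tokens billed at the cached rate.
    Assumption: cache storage fees are not included.
    """
    output_share = 1.0 - input_share
    # Input tokens are billed at a mix of the cached and uncached rates.
    input_blend = (
        cache_hit_rate * CACHED_INPUT_RATE
        + (1.0 - cache_hit_rate) * INPUT_RATE
    )
    return input_share * input_blend + output_share * OUTPUT_RATE


if __name__ == "__main__":
    rate = effective_rate(input_share=0.92, cache_hit_rate=0.82)
    print(f"${rate:.2f}/M")  # prints $0.27/M
```

With no caching the same split comes out higher, because every input token is billed at the full $0.30/M.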
§ 01 / TERMINAL
Run the numbers.
Live calculator pre-loaded with current Gemini 2.5 Flash standard rates. Standard text, image, and video input is $0.30/M; audio input is $1/M; output includes thinking tokens.
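Converting a monthly budget into a token count is a single division by the effective rate. The sketch below assumes an effective rate is already known (for example the $0.27/M agentic blend quoted on this page). The function names are hypothetical, and the 0.75 words-per-token ratio is a common English rule of thumb, not a figure published by Google.

```python
# Rule-of-thumb ratio for English prose; an assumption, not a Google figure.
WORDS_PER_TOKEN = 0.75


def tokens_for_budget(budget_usd: float, rate_per_million: float) -> int:
    """Number of tokens a budget buys at a given USD-per-1M-token rate."""
    if rate_per_million <= 0:
        raise ValueError("rate_per_million must be positive")
    return int(budget_usd / rate_per_million * 1_000_000)


def words_equivalent(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Rough English word count for a token total."""
    return int(tokens * words_per_token)


if __name__ == "__main__":
    # Example: $100 per month at the page's $0.27/M agentic blend.
    tokens = tokens_for_budget(100.0, 0.27)
    print(f"{tokens:,} tokens, about {words_equivalent(tokens):,} words")
```

Passing the plain $0.30/M input rate or the $2.50/M output rate instead gives the bounds for input-only and output-only workloads.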
§ 02 / SCENARIOS
Real-world presets.
CODING
Coding agent iteration
$0.044/task
LONG CONTEXT
Document pack analysis
$0.072/pack
MULTIMODAL
Video + docs briefing
$0.017/brief
AGENT
Research agent loop
$0.061/loop
RAG
Large RAG synthesis
$0.075/answer
§ 03 / TAPE
Price history.
Input · $0.30/M
Output · $2.50/M
Cached · $0.03/M
JUN 17 Released as the GA Gemini 2.5 Flash model in Google's Gemini API release notes
MAY 18 Verified unchanged on Google Gemini API pricing page
§ 04 / TOKENIZER
Paste text. See tokens. See cost.
Estimate · gemini-tokenizer-estimate · ≈3.85 chars/token
This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.
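The chars-per-token estimate and the three cost lines can be sketched in a few lines. This assumes the ≈3.85 chars/token ratio shown above and the standard text rates from this page. Rounding to the nearest whole token is an assumption made here, and the function names are illustrative. It remains an approximation, not a real tokenizer.

```python
# Planning ratio shown by the estimator on this page (not a real tokenizer).
CHARS_PER_TOKEN = 3.85

# Standard gemini-2.5-flash text rates from this page, USD per 1M tokens.
RATES_PER_MILLION = {
    "input": 0.30,
    "output": 2.50,
    "cached_input": 0.03,
}


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    # Rounding to the nearest token is an assumption of this sketch.
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_costs(text: str) -> dict[str, float]:
    """Estimated USD cost of the text billed as each token type."""
    tokens = estimate_tokens(text)
    return {
        kind: tokens * rate / 1_000_000
        for kind, rate in RATES_PER_MILLION.items()
    }


if __name__ == "__main__":
    sample = "Summarize the attached contract in five bullet points."
    print(len(sample), "characters,", len(sample.split()), "words")
    print(estimate_tokens(sample), "tokens (estimated)")
    for kind, cost in estimate_costs(sample).items():
        print(f"{kind}: ${cost:.8f}")
```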
| Model | Input /M | Output /M | Effective blended | Context | Best for |
|---|---|---|---|---|---|
| Gemini 3.1 Pro Preview | $2.00 (cache $0.20) | $12.00 | $1.44 (frontier preview) | 1M | Google frontier preview |
| Gemini 3 Flash Preview | $0.50 (cache $0.05) | $3.00 | $0.36 (preview flash) | 1M | Cheaper Gemini 3 preview |
| Gemini 3.1 Flash-Lite | $0.25 (cache $0.03) | $1.50 | $0.18 (light tier) | 1M | High-volume Gemini 3.1 |
| Gemini 2.5 Pro | $1.25 (cache $0.13) | $10.00 | $1.10 (stable pro) | 2M | Stable Gemini 2.5 Pro |
| Gemini 2.5 Flash (current) | $0.30 (cache $0.03) | $2.50 | $0.27 (agentic 92/8) | 1M | Best price-performance Gemini 2.5 |
| Gemini 2.5 Flash-Lite | $0.10 (cache $0.01) | $0.40 | $0.06 (budget tier) | 1M | Cheapest stable Gemini 2.5 |
| GPT-5.4 | $2.50 (cache $0.25) | $15.00 | $1.80 (OpenAI competitor) | 1.05M | Affordable OpenAI frontier work |
| GPT-5.4 mini | $0.75 (cache $0.07) | $4.50 | $0.54 (OpenAI mini) | 400K | Subagents and lightweight coding |
Frequently asked.
Practical Gemini 2.5 Flash pricing questions, with Google token rates separated from workload assumptions.
Q · 01 What is Gemini 2.5 Flash's standard API price?
Google lists gemini-2.5-flash at $0.30/M input, $0.03/M cached input, and $2.50/M output for text, image, and video workloads. Audio input is listed separately at $1/M.
Q · 02 Does output pricing include thinking tokens?
Yes. Google's pricing page labels output as "Output price (including thinking tokens)". This page treats generated reasoning and answer tokens as part of the published output rate.
Q · 03 How much do Batch and Flex cost?
Google lists Batch and Flex at $0.15/M input for text, image, and video, $0.50/M audio input, and $1.25/M output. Context caching is $0.03/M for text, image, and video and $0.10/M for audio.
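To see what the Batch and Flex discount means for a concrete job, the sketch below prices the same token volume at both rate cards. It covers text, image, and video tokens only and ignores audio, caching, and tool charges. The 10M input / 1M output workload is a hypothetical example, and `job_cost` is an illustrative name.

```python
# Rates from this page, USD per 1M tokens (text, image, and video only).
STANDARD = {"input": 0.30, "output": 2.50}
BATCH_FLEX = {"input": 0.15, "output": 1.25}


def job_cost(input_tokens: int, output_tokens: int, rates: dict[str, float]) -> float:
    """USD cost of a job at the given per-1M-token rates."""
    return (
        input_tokens * rates["input"] + output_tokens * rates["output"]
    ) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10M input tokens and 1M output tokens.
    standard = job_cost(10_000_000, 1_000_000, STANDARD)
    discounted = job_cost(10_000_000, 1_000_000, BATCH_FLEX)
    savings = 1.0 - discounted / standard
    print(f"standard: ${standard:.2f}")
    print(f"batch/flex: ${discounted:.2f}")
    print(f"savings: {savings:.0%}")
```

Because both the input and output rates are halved, the saving is 50% regardless of the input/output mix.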
Q · 04 Is this a preview model?
No. Google's models page lists Gemini 2.5 Flash as a stable Gemini 2.5 model.
Q · 05 What about Google Search grounding costs?
For Gemini 2.5 models, Google lists free daily grounding allowances and then paid grounded-prompt pricing. Tool charges are separate from token prices and should be budgeted outside token spend.
Q · 06 How accurate is the tokenizer estimate?
The widget uses roughly 4 characters per token (≈3.85, as shown in the estimator) as a Gemini planning estimate. Exact billing can differ by language, media inputs, tool calls, and how Google tokenizes multimodal content.