GEMINI 3 PREVIEW · 1M CONTEXT · MULTIMODAL · CONTEXT CACHING · BATCH + FLEX -50%
Gemini 3 Flash Preview API Pricing
Google's cheaper Gemini 3 preview tier for fast multimodal and agentic workloads: $0.50/M input, $3.00/M output, and $0.05/M cached input. Rates are taken directly from ai.google.dev.
Input (per 1M tokens): $0.50/M · text/image/video, standard tier
Output (per 1M tokens): $3.00/M · includes thinking tokens, standard tier
Cached input (90% off): $0.05/M · cache rate plus storage fee
Effective (agentic blend): $0.36/M · 92/8 input/output split, 82% cache hit rate
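The $0.36/M effective figure is consistent with a simple weighted average of the published rates. The sketch below assumes that "92/8" means 92% input tokens and 8% output tokens, and that "82% cache" means 82% of input tokens are billed at the cached rate. The function name and parameters are illustrative, and the cache storage fee is ignored.

```python
def blended_rate(input_price, cached_price, output_price,
                 input_share=0.92, cache_hit=0.82):
    """Effective USD per 1M tokens for a mixed input/output workload.

    Assumption: cache hits are billed at cached_price, misses at input_price,
    and the cache storage fee is not included.
    """
    effective_input = cache_hit * cached_price + (1 - cache_hit) * input_price
    output_share = 1 - input_share
    return input_share * effective_input + output_share * output_price


# Gemini 3 Flash Preview standard rates from this page.
rate = blended_rate(input_price=0.50, cached_price=0.05, output_price=3.00)
print(f"${rate:.2f}/M")  # $0.36/M
```

With these assumptions, output tokens account for about two thirds of the blended rate ($0.24 of $0.36) even though they are only 8% of the traffic, so the output share is the first number to check against a real workload.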
§ 01 / TERMINAL
Run the numbers.
Live calculator pre-loaded with current Gemini 3 Flash Preview standard rates. Standard text, image, and video input is $0.50/M; audio input is $1/M; output includes thinking tokens.
Inputs: monthly budget ($/mo), workload split, prompt cache hit rate.
Outputs: tokens you can process, words equivalent (English), effective rate.
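The budget-to-tokens conversion behind a calculator like this can be sketched in a few lines. This is a minimal illustration rather than the page's actual implementation: the function names are hypothetical, the $0.36/M rate is the agentic blend quoted on this page, and the 0.75 words-per-token ratio is a common rule of thumb for English that is assumed here.

```python
EFFECTIVE_RATE = 0.36    # USD per 1M tokens, agentic blend quoted on this page
WORDS_PER_TOKEN = 0.75   # assumption: rough English average, not a Google figure


def tokens_for_budget(budget_usd, rate_per_million=EFFECTIVE_RATE):
    """Number of tokens a monthly budget covers at a given blended rate."""
    if rate_per_million <= 0:
        raise ValueError("rate_per_million must be positive")
    return round(budget_usd / rate_per_million * 1_000_000)


def words_equivalent(tokens, words_per_token=WORDS_PER_TOKEN):
    """Approximate English word count for a token count."""
    return round(tokens * words_per_token)


tokens = tokens_for_budget(100)
print(f"{tokens:,} tokens, about {words_equivalent(tokens):,} words")
```

Changing the workload split or cache hit rate changes the blended rate, so in practice the rate passed in here would be recomputed from those inputs first.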
§ 02 / SCENARIOS
Real-world presets.
CODING
Coding agent iteration
$0.064/task
LONG CONTEXT
Document pack analysis
$0.115/pack
MULTIMODAL
Video + docs briefing
$0.024/brief
AGENT
Research agent loop
$0.090/loop
RAG
Large RAG synthesis
$0.111/answer
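The presets above are per-task costs, but the page does not list the token counts behind them. The sketch below shows how such a figure can be computed from the standard rates; the `Usage` class and the example token counts are hypothetical and are not meant to reproduce any preset.

```python
from dataclasses import dataclass

# Standard rates from this page, in USD per 1M tokens.
INPUT_RATE = 0.50
CACHED_RATE = 0.05
OUTPUT_RATE = 3.00


@dataclass
class Usage:
    """Token counts for one task (hypothetical structure)."""
    uncached_input: int
    cached_input: int
    output: int  # includes thinking tokens


def task_cost(usage):
    """USD cost of one task at standard text/image/video rates."""
    total = (usage.uncached_input * INPUT_RATE
             + usage.cached_input * CACHED_RATE
             + usage.output * OUTPUT_RATE)
    return total / 1_000_000


# Illustrative numbers only.
example = Usage(uncached_input=20_000, cached_input=80_000, output=5_000)
print(f"${task_cost(example):.3f}/task")  # $0.029/task
```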
§ 03 / TAPE
Price history.
Input · $0.50/M
Output · $3/M
Cached · $0.05/M
DEC 17 · Launched as Gemini 3 Flash Preview, per Google's Gemini API release notes
MAY 18 · Verified unchanged on Google's Gemini API pricing page
§ 04 / TOKENIZER
Paste text. See tokens. See cost.
Estimate · gemini-tokenizer-estimate · ≈3.85 chars/token
This is a chars-per-token approximation, not a real tokenizer. Actual tokens vary by language, code density, and tool-call overhead — counts are typically ±10–20% off for English prose, more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.
Outputs: characters, words, tokens (estimated), cost as input (uncached), cost as output (uncached), cost as cached input.
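A chars-per-token estimate of this kind is easy to reproduce. The sketch below uses the ≈3.85 chars/token ratio and the standard rates stated on this page; the function names are hypothetical, and the result is a planning estimate, not a substitute for the vendor's tokenizer.

```python
CHARS_PER_TOKEN = 3.85  # approximation quoted above, not a real tokenizer

# Standard rates from this page, in USD per 1M tokens.
RATES = {"input": 0.50, "output": 3.00, "cached_input": 0.05}


def estimate_tokens(text):
    """Rough token count from character length."""
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_costs(text):
    """Estimated USD cost of the text under each billing category."""
    tokens = estimate_tokens(text)
    return {kind: tokens * rate / 1_000_000 for kind, rate in RATES.items()}


sample = "Summarize the attached contract and list every payment deadline."
print(len(sample), "chars,", len(sample.split()), "words,",
      estimate_tokens(sample), "tokens (estimated)")
for kind, cost in estimate_costs(sample).items():
    print(f"{kind}: ${cost:.8f}")
```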
| Model | Input /M (cached) | Output /M | Effective blended /M | Context | Best for |
|---|---|---|---|---|---|
| Gemini 3.1 Pro Preview | $2.00 (cached $0.20) | $12.00 | $1.44 (frontier preview) | 1M | Google frontier preview |
| Gemini 3 Flash Preview (current) | $0.50 (cached $0.05) | $3.00 | $0.36 (agentic 92/8) | 1M | Cheaper Gemini 3 preview |
| Gemini 3.1 Flash-Lite | $0.25 (cached $0.03) | $1.50 | $0.18 (light tier) | 1M | High-volume Gemini 3.1 |
| Gemini 2.5 Pro | $1.25 (cached $0.13) | $10.00 | $1.10 (stable pro) | 2M | Stable Gemini 2.5 Pro |
| Gemini 2.5 Flash | $0.30 (cached $0.03) | $2.50 | $0.27 (stable flash) | 1M | Best price-performance Gemini 2.5 |
| Gemini 2.5 Flash-Lite | $0.10 (cached $0.01) | $0.40 | $0.06 (budget tier) | 1M | Cheapest stable Gemini 2.5 |
| GPT-5.4 | $2.50 (cached $0.25) | $15.00 | $1.80 (OpenAI competitor) | 1.05M | Affordable OpenAI frontier work |
| GPT-5.4 mini | $0.75 (cached $0.07) | $4.50 | $0.54 (OpenAI mini) | 400K | Subagents and lightweight coding |
Frequently asked.
Practical Gemini 3 Flash Preview pricing questions, with Google token rates separated from workload assumptions.
Q · 01 What is Gemini 3 Flash Preview's standard API price?
Google lists `gemini-3-flash-preview` at $0.50/M input, $0.05/M cached input, and $3/M output for text, image, and video workloads. Audio input is listed separately at $1/M.
Q · 02 Does output pricing include thinking tokens?
Yes. Google's pricing page labels output as "Output price (including thinking tokens)". This page treats generated reasoning and answer tokens as part of the published output rate.
Q · 03 How much do Batch and Flex cost?
Google lists Batch and Flex at $0.25/M input for text, image, and video, $0.50/M audio input, and $1.50/M output. Context caching is $0.05/M for text, image, and video and $0.10/M for audio.
Q · 04 Is this a preview model?
Yes. Google's pricing page labels this row as a preview model, so limits and behavior can change before a stable release.
Q · 05 What about Google Search grounding costs?
For Gemini 3 models, Google lists 5,000 grounded prompts per month free, shared across Gemini 3, then $14 / 1,000 search queries. Tool charges are separate from token prices.
Q · 06 How accurate is the tokenizer estimate?
The widget uses 4.0 characters per token as a Gemini planning estimate. Exact billing can differ by language, media inputs, tool calls, and how Google tokenizes multimodal content.
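To compare the standard and Batch/Flex rates quoted in the answers above, a small lookup table is enough. This is a sketch under stated assumptions: the tier keys and function name are hypothetical, context caching and Search grounding charges are left out, and the token counts are illustrative.

```python
# USD per 1M tokens, copied from the answers above.
# "text" covers text, image, and video input.
RATES = {
    "standard":   {"text": 0.50, "audio": 1.00, "output": 3.00},
    "batch_flex": {"text": 0.25, "audio": 0.50, "output": 1.50},
}


def token_cost(tier, text_in=0, audio_in=0, output=0):
    """USD token cost for one job; excludes caching and tool charges."""
    r = RATES[tier]
    total = text_in * r["text"] + audio_in * r["audio"] + output * r["output"]
    return total / 1_000_000


# Illustrative job: 1M text input tokens and 200K output tokens.
standard = token_cost("standard", text_in=1_000_000, output=200_000)
batch = token_cost("batch_flex", text_in=1_000_000, output=200_000)
print(f"standard ${standard:.2f}, batch/flex ${batch:.2f}")
# standard $1.10, batch/flex $0.55
```

Because every Batch/Flex rate listed here is half the standard rate, the saving is 50% regardless of the input/output mix.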