GEMINI 3.1 STABLE · 1M CONTEXT · MULTIMODAL · CONTEXT CACHING · BATCH + FLEX -50%

Gemini 3.1 Flash-Lite API Pricing

Google's cost-efficient stable Gemini 3.1 tier for high-volume agentic tasks and simple processing: $0.25/M input, $1.50/M output, and $0.025/M cached input. Rates are taken directly from Google's pricing page at ai.google.dev.

Input (per 1M tokens) · $0.25/M · text/image/video, standard
Output (per 1M tokens) · $1.50/M · includes thinking tokens, standard
Cached input (90% off) · $0.025/M · cache plus storage fee
Effective (agentic blend) · $0.18/M · 92/8 input/output split, 82% cache hit rate
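The page does not spell out how the agentic blend is computed. The sketch below reproduces the listed $0.18/M under two assumptions: that 92/8 is the input/output token share, and that 82% is the fraction of input tokens served from cache. The function name `effective_rate` and its parameters are illustrative, and cache storage fees are ignored.

```python
def effective_rate(input_price, cached_price, output_price,
                   input_share=0.92, cache_hit=0.82):
    """Blended USD price per 1M tokens (assumed formula, storage fees ignored)."""
    # Average price of an input token, mixing cached and uncached reads.
    blended_input = cache_hit * cached_price + (1 - cache_hit) * input_price
    # Weight input and output prices by their share of total tokens.
    return input_share * blended_input + (1 - input_share) * output_price


rate = effective_rate(input_price=0.25, cached_price=0.025, output_price=1.50)
print(f"${rate:.2f}/M")  # $0.18/M
```

With the default weights, output tokens contribute about $0.12 of the $0.18 despite being 8% of the volume, so the blend is more sensitive to the output share than to the cache hit rate.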
§ 01 / TERMINAL

Run the numbers.

Live calculator pre-loaded with current Gemini 3.1 Flash-Lite standard rates. Standard text, image, and video input is $0.25/M; audio input is $0.50/M; output includes thinking tokens.

Inputs: monthly budget ($/mo), workload split, prompt cache hit rate.
Outputs: tokens you can process, words equivalent (English), effective rate.
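As a rough illustration of what the calculator reports, a monthly budget can be converted to a token count by dividing by the effective rate. This is a minimal sketch, not the page's implementation: `tokens_for_budget` is a hypothetical helper, the $0.18/M rate is the agentic blend listed on this page, and the 0.75 words-per-token ratio is an assumed rule of thumb for English.

```python
WORDS_PER_TOKEN = 0.75  # assumption: rough English average, not from Google


def tokens_for_budget(budget_usd, rate_per_million):
    """Number of tokens a budget buys at a blended USD-per-1M-token rate."""
    if rate_per_million <= 0:
        raise ValueError("rate_per_million must be positive")
    return budget_usd / rate_per_million * 1_000_000


tokens = tokens_for_budget(budget_usd=100, rate_per_million=0.18)
words = tokens * WORDS_PER_TOKEN
print(f"{tokens / 1e6:.0f}M tokens, about {words / 1e6:.0f}M English words")
```

At $100 per month this works out to roughly 556M tokens; a different workload split or cache hit rate changes the effective rate and therefore the result.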
§ 02 / SCENARIOS

Real-world presets.

§ 03 / TAPE

Price history.

Gemini 3.1 Flash-Lite is listed at $0.25/M input and $1.50/M output on Google's Gemini API pricing page.

Input · $0.25/M
Output · $1.50/M
Cached · $0.025/M
MAY 07 · Released as the GA Gemini 3.1 Flash-Lite model in Google's Gemini API release notes
MAY 18 · Verified unchanged on Google's Gemini API pricing page
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · gemini-tokenizer-estimate · ≈3.85 chars/token

This is a chars-per-token approximation, not a real tokenizer. Actual token counts vary by language, code density, and tool-call overhead; the estimate is typically within ±10–20% for English prose and further off for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.

Reported values: characters, words, tokens (estimated), cost as input (uncached), cost as output (uncached), cost as cached input.
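The estimate described in this section can be sketched as a simple character count divided by a fixed ratio. The code below assumes the ≈3.85 chars/token figure and the standard text rates quoted on this page; the `estimate` function and its rounding to the nearest whole token are illustrative choices, not the widget's actual code.

```python
CHARS_PER_TOKEN = 3.85   # planning ratio quoted for the widget
INPUT_PER_M = 0.25       # USD per 1M uncached input tokens
OUTPUT_PER_M = 1.50      # USD per 1M output tokens
CACHED_PER_M = 0.025     # USD per 1M cached input tokens


def estimate(text):
    """Approximate token count and costs for a piece of text."""
    tokens = round(len(text) / CHARS_PER_TOKEN)  # rounding rule is an assumption
    return {
        "characters": len(text),
        "words": len(text.split()),
        "tokens": tokens,
        "cost_input": tokens * INPUT_PER_M / 1_000_000,
        "cost_output": tokens * OUTPUT_PER_M / 1_000_000,
        "cost_cached": tokens * CACHED_PER_M / 1_000_000,
    }


for key, value in estimate("Paste text. See tokens. See cost.").items():
    print(key, value)
```

Because the ratio is fixed, the sketch inherits the same error margin as the widget and is only suitable for planning.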
§ 05 / SHELF

Up against the shelf.

Model                            Input /M  Cached /M  Output /M  Effective blended          Context  Best for
Gemini 3.1 Pro Preview           $2.00     $0.20      $12.00     $1.44 (frontier preview)   1M       Google frontier preview
Gemini 3 Flash Preview           $0.50     $0.05      $3.00      $0.36 (preview flash)      1M       Cheaper Gemini 3 preview
Gemini 3.1 Flash-Lite (current)  $0.25     $0.025     $1.50      $0.18 (agentic 92/8)       1M       High-volume Gemini 3.1
Gemini 2.5 Pro                   $1.25     $0.13      $10.00     $1.10 (stable pro)         2M       Stable Gemini 2.5 Pro
Gemini 2.5 Flash                 $0.30     $0.03      $2.50      $0.27 (stable flash)       1M       Best price-performance Gemini 2.5
Gemini 2.5 Flash-Lite            $0.10     $0.01      $0.40      $0.06 (budget tier)        1M       Cheapest stable Gemini 2.5
GPT-5.4                          $2.50     $0.25      $15.00     $1.80 (OpenAI competitor)  1.05M    Affordable OpenAI frontier work
GPT-5.4 mini                     $0.75     $0.07      $4.50      $0.54 (OpenAI mini)        400K     Subagents and lightweight coding

Frequently asked.

Practical Gemini 3.1 Flash-Lite pricing questions, with Google token rates separated from workload assumptions.

Q · 01 What is Gemini 3.1 Flash-Lite's standard API price?
Google lists gemini-3.1-flash-lite at $0.25/M input, $0.025/M cached input, and $1.50/M output for text, image, and video workloads. Audio input is listed separately at $0.50/M.
Q · 02 Does output pricing include thinking tokens?
Yes. Google's pricing page labels the rate "Output price (including thinking tokens)". This page treats generated reasoning and answer tokens as part of the published output rate.
Q · 03 How much do Batch and Flex cost?
Google lists Batch and Flex at half the standard rates: $0.125/M input for text, image, and video; $0.25/M for audio input; and $0.75/M output. Context caching is $0.0125/M for text, image, and video and $0.025/M for audio.
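To see what the Batch and Flex rates mean for a concrete workload, the sketch below prices the same token volumes at both rate cards. The rates are the ones quoted on this page; the `cost` helper, its parameter names, and the example volumes are hypothetical, and the cached rate shown applies to text, image, and video only.

```python
# USD per 1M tokens, as quoted on this page.
STANDARD = {"input": 0.25, "audio_input": 0.50, "output": 1.50, "cached": 0.025}
BATCH_FLEX = {"input": 0.125, "audio_input": 0.25, "output": 0.75, "cached": 0.0125}


def cost(rates, input_m=0.0, audio_m=0.0, output_m=0.0, cached_m=0.0):
    """Total USD cost; all volumes are in millions of tokens."""
    return (input_m * rates["input"]
            + audio_m * rates["audio_input"]
            + output_m * rates["output"]
            + cached_m * rates["cached"])


# Example workload (assumed): 100M text input tokens, 10M output tokens.
standard = cost(STANDARD, input_m=100, output_m=10)
batch = cost(BATCH_FLEX, input_m=100, output_m=10)
print(f"standard ${standard:.2f}, batch/flex ${batch:.2f}")  # standard $40.00, batch/flex $20.00
```

Since every listed Batch and Flex rate is half its standard counterpart, the saving is 50% regardless of the mix.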
Q · 04 Is this a preview model?
No. Google's models page labels Gemini 3.1 Flash-Lite as Stable, while the separate preview row is being deprecated.
Q · 05 What about Google Search grounding costs?
For Gemini 3 models, Google lists 5,000 grounded prompts per month free, shared across Gemini 3 models, then $14 per 1,000 search queries. Tool charges are separate from token prices.
Q · 06 How accurate is the tokenizer estimate?
The widget uses a fixed characters-per-token ratio (about 3.85, or roughly 4) as a Gemini planning estimate. Exact billing can differ by language, media inputs, tool calls, and how Google tokenizes multimodal content.