SHUT DOWN · 1M CONTEXT · ARCHIVE PRICE · REPLACE WITH GEMINI-2-5-FLASH-LITE

Gemini 1.5 Flash-8B API Pricing

Gemini 1.5 Flash-8B is Google's retired small Gemini 1.5 model. Its listing is kept because it was a very low-cost first-party production model in late 2024: $0.0375/M input and $0.15/M output, with no active cache discount listed. This archive page exists for invoice reconciliation and price history.

Input - per 1M tokens: $0.04/M (rounded from $0.0375/M; Tier 1 baseline, archive)
Output - per 1M tokens: $0.15/M (Tier 2: $0.30/M, archive)
Cache - N/A, billed as input: $0.04/M (no cache price listed)
Effective - agentic blend: $0.05/M (92/8 split, 82% cache)
§ 01 / TERMINAL

Run the numbers.

Calculator pre-loaded with Gemini 1.5 Flash-8B archive rates. Archive price uses the verified project snapshot: $0.0375/M input and $0.15/M output, with a higher >128K prompt tier at $0.075/M input and $0.30/M output.
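The calculator's core arithmetic can be sketched as follows. This is a minimal sketch, not the page's actual implementation: the function names are hypothetical, the default 92/8 input/output split is taken from the headline blend, and the words-per-token ratio is a rough assumption for English that does not come from this page. It covers only the base tier (prompts up to 128K tokens).

```python
# Archive rates from this page, in USD per 1M tokens (base tier only).
INPUT_PER_M = 0.0375
OUTPUT_PER_M = 0.15

# Assumption: rough English words-per-token ratio, not a figure from this page.
WORDS_PER_TOKEN = 0.75


def effective_rate(input_share: float = 0.92) -> float:
    """Blended USD per 1M tokens for a given input/output workload split.

    Cached input is billed as regular input for this model, so the cache
    hit rate does not change the result.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * INPUT_PER_M + (1.0 - input_share) * OUTPUT_PER_M


def tokens_for_budget(monthly_usd: float, input_share: float = 0.92) -> int:
    """Total tokens (input plus output) that a monthly budget would cover."""
    if monthly_usd < 0:
        raise ValueError("monthly_usd must be non-negative")
    return int(monthly_usd / effective_rate(input_share) * 1_000_000)


if __name__ == "__main__":
    tokens = tokens_for_budget(100.0)
    print(f"effective rate: ${effective_rate():.4f}/M")
    print(f"tokens for $100/mo: {tokens:,}")
    print(f"approx. English words: {int(tokens * WORDS_PER_TOKEN):,}")
```

With the default split the blended rate works out to about $0.0465/M, which the page displays rounded as $0.05/M.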

§ 02 / SCENARIOS

Real-world presets.

§ 03 / TAPE

Price history.

This is an archive page for Gemini 1.5 Flash-8B: the endpoint was shut down on 2025-09-29.

Input · $0.04/M
Output · $0.15/M
Cached · $0.04/M
OCT 03 · Released gemini-1-5-flash-8b
SEP 29 · Endpoint shut down; replacement is gemini-2-5-flash-lite
MAY 18 · Archive pricing retained from verified snapshot
§ 04 / TOKENIZER

Paste text. See tokens. See cost.

Estimate · gemini-tokenizer-estimate · ≈3.85 chars/token

This is a chars-per-token approximation, not a real tokenizer. Actual token counts vary by language, code density, and tool-call overhead. Estimates are typically off by 10–20% for English prose, and by more for code or non-Latin scripts. For exact billing, use the vendor's official tokenizer.
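The estimate described above amounts to dividing character count by a fixed ratio and pricing the result. A minimal sketch, assuming the 3.85 chars/token ratio shown on this page and the base-tier archive rates (the function names and dictionary keys are hypothetical, and this is not a real tokenizer):

```python
CHARS_PER_TOKEN = 3.85   # planning ratio shown on this page's estimator
INPUT_PER_M = 0.0375     # USD per 1M input tokens (archive, base tier)
OUTPUT_PER_M = 0.15      # USD per 1M output tokens (archive, base tier)


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    if not text:
        return 0
    return max(1, round(len(text) / CHARS_PER_TOKEN))


def estimate_costs(text: str) -> dict:
    """Character, word, and token counts plus the cost of the text in each role."""
    tokens = estimate_tokens(text)
    input_cost = tokens * INPUT_PER_M / 1_000_000
    return {
        "characters": len(text),
        "words": len(text.split()),
        "tokens_estimated": tokens,
        "cost_as_input_usd": input_cost,
        "cost_as_output_usd": tokens * OUTPUT_PER_M / 1_000_000,
        # No cache price is listed, so cached input is costed as regular input.
        "cost_as_cached_input_usd": input_cost,
    }


if __name__ == "__main__":
    for key, value in estimate_costs("Paste text. See tokens. See cost.").items():
        print(f"{key}: {value}")
```

Because the ratio is fixed, the sketch inherits the same error margin as the widget: treat its output as a planning figure, not a billing figure.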

§ 05 / SHELF

Up against the shelf.

| Model | Input /M | Output /M | Effective blended | Context | Best for |
| --- | --- | --- | --- | --- | --- |
| Gemini 1.5 Flash-8B (current page) | $0.04 | $0.15 | $0.05 (archive, 92/8) | 1M | Archive ultra-cheap Gemini invoices |
| Gemini 2.5 Flash-Lite | $0.10 (cache $0.01) | $0.40 | $0.06 (stable lite replacement) | 1M | Current low-cost Gemini |
| Gemini 2.5 Flash | $0.30 (cache $0.03) | $2.50 | $0.27 (stable flash replacement) | 1M | Current Flash replacement |
| Gemini 2.5 Pro | $1.25 (cache $0.13) | $10.00 | $1.10 (stable pro) | 2M | Stable Gemini Pro |
| GPT-5.4 mini | $0.75 (cache $0.07) | $4.50 | $0.54 (OpenAI mini) | 400K | Subagents and lightweight coding |
| DeepSeek V4 Pro | $0.43 (cache $0.00) | $0.87 | $0.14 (budget frontier) | 1M | Low-cost reasoning alternative |
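The "Effective blended" column can be approximated with a simple weighted average. The formula below is an assumption inferred from the "92/8 split - 82% cache" label, not a documented method: 92% of tokens are input, 82% of that input is served at the cache price, and the remaining 8% are output. The `blended_rate` function and `SHELF` dictionary are hypothetical names, and the rates are the rounded values listed in the table, so results only land within about a cent of the listed figures.

```python
def blended_rate(input_per_m, output_per_m, cached_per_m=None,
                 input_share=0.92, cache_hit=0.82):
    """Effective USD per 1M tokens for a mixed input/output workload.

    When no cache price exists, cached input is billed at the regular
    input price, so the cache hit rate has no effect.
    """
    if cached_per_m is None:
        cached_per_m = input_per_m
    input_rate = cache_hit * cached_per_m + (1.0 - cache_hit) * input_per_m
    return input_share * input_rate + (1.0 - input_share) * output_per_m


# (input, output, cached input) in USD per 1M tokens, as listed on this page.
SHELF = {
    "Gemini 1.5 Flash-8B": (0.0375, 0.15, None),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40, 0.01),
    "Gemini 2.5 Flash": (0.30, 2.50, 0.03),
    "Gemini 2.5 Pro": (1.25, 10.00, 0.13),
    "GPT-5.4 mini": (0.75, 4.50, 0.07),
    "DeepSeek V4 Pro": (0.43, 0.87, 0.00),
}

if __name__ == "__main__":
    for model, (inp, out, cached) in SHELF.items():
        print(f"{model}: ${blended_rate(inp, out, cached):.4f}/M")
```

For Gemini 1.5 Flash-8B the cache term drops out, leaving 0.92 × $0.0375 + 0.08 × $0.15, or roughly $0.0465/M.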

Frequently asked.

Gemini 1.5 Flash-8B pricing questions, with archive status separated from token math.

Q · 01 What is Gemini 1.5 Flash-8B's API price?
This is an archive price, not a current endpoint recommendation. The page uses $0.0375/M input and $0.15/M output. The archived higher prompt tier is $0.075/M input and $0.30/M output.
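For invoice reconciliation, the two archived tiers can be applied per request. The sketch below is illustrative only and rests on two labeled assumptions: that ">128K" means more than 128,000 prompt tokens, and that a request whose prompt crosses that threshold bills both its input and its output at the higher tier. `Request`, `request_cost`, and `invoice_total` are hypothetical names.

```python
from dataclasses import dataclass

# Assumption: ">128K" is read as more than 128,000 prompt tokens.
TIER_THRESHOLD_TOKENS = 128_000

# (input USD/M, output USD/M) per tier, from the archive snapshot.
BASE_TIER = (0.0375, 0.15)
LONG_PROMPT_TIER = (0.075, 0.30)


@dataclass(frozen=True)
class Request:
    prompt_tokens: int
    output_tokens: int


def request_cost(req: Request) -> float:
    """USD cost of one request under the archived two-tier pricing."""
    # Assumption: the whole request moves to the higher tier once the
    # prompt exceeds the threshold.
    if req.prompt_tokens > TIER_THRESHOLD_TOKENS:
        in_rate, out_rate = LONG_PROMPT_TIER
    else:
        in_rate, out_rate = BASE_TIER
    return (req.prompt_tokens * in_rate + req.output_tokens * out_rate) / 1_000_000


def invoice_total(requests) -> float:
    """Sum of per-request costs, for comparison against a billed amount."""
    return sum(request_cost(r) for r in requests)


if __name__ == "__main__":
    usage = [Request(100_000, 10_000), Request(200_000, 1_000)]
    print(f"expected charge: ${invoice_total(usage):.6f}")
```

If an old invoice disagrees with this total, the threshold interpretation is the first assumption to check against the billing export.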
Q · 02 Is Gemini 1.5 Flash-8B still available?
No. Google documentation marks the endpoint as shut down on 2025-09-29. Use gemini-2-5-flash-lite for current traffic.
Q · 03 Does prompt caching apply?
No active cache price is listed for this retired model. For blended math, this page treats cached input as regular input at $0.0375/M rather than inventing a discount.
Q · 04 How should I use this page?
Use it for archive pricing, old invoice reconciliation, and generation-to-generation price comparisons. Do not route new production traffic to this model, because the endpoint has already been shut down.
Q · 05 What about Batch and Flex pricing?
This archive page does not model a current Batch or Flex path because Gemini 1.5 Flash-8B is shut down. Use the recommended replacement for current Batch or Flex pricing.
Q · 06 How accurate is the tokenizer estimate?
The widget uses roughly 3.85 characters per token as a Gemini planning estimate. Exact billing can differ by language, media inputs, tool calls, and how Google tokenizes multimodal content.