Last verified 2026-06-08

FRONTIER LABALPHABET SUBSIDIARYGOOGLE DEEPMINDMULTIMODAL NATIVE2M CONTEXT

Gemini API Pricing

Q: Which Gemini model should I start with?

For most production use cases: Gemini 2.5 Pro. It's GA-stable, has the longest context window in the lineup (2M), and is cheaper than the 3.1 preview at $1.25/$10. Move to Gemini 3.1 Pro Preview for newest reasoning + the 3.x preview surface. Move down to Gemini 2.5 Flash for daily traffic, 2.5 Flash-Lite for classification. For coding agents, Gemini 3 Flash Preview hits 78% SWE-bench Verified at a quarter the cost of Pro.

Q: How does Gemini compare to Claude and GPT?

Strongest at long context (2M on 2.5 Pro vs 1M / 200K on rivals), native multimodal (text + image + audio + video all in one model), and productivity-suite integration via Workspace. Weaker than Anthropic on coding-agent benchmarks (78% vs 87.6% on SWE-bench Verified). Less polished consumer chat product than OpenAI's ChatGPT despite the Gemini app's Workspace integration.

Q: Does Google offer EU data residency?

Yes, with a caveat. Vertex AI / Gemini Enterprise Agent Platform exposes EU multi-region data residency at-rest, and Gemini 2.5 Pro + 2.0 Flash are available in europe-west4 (Netherlands) for GDPR-compliant workloads. Gemini 3.x is not yet in EU regions; if you need the newest models, you're still pinned to US / global endpoints, which the Vertex docs warn do not guarantee in-region processing.

Q: What's the data retention policy?

On the paid Gemini API tier: prompts and responses are not used to improve models and are retained for a short period for abuse monitoring (Google does not publish a fixed number for the paid tier; check the latest Gemini API additional terms). On the free tier: prompts and responses are used to improve products. On Vertex / Gemini Enterprise: customer-controlled, no training on inputs, contractual data-residency commitments.

Q: When will Gemini 3 hit general availability?

The 3.x family is still in preview as of May 19, 2026. Cadence so far: Gemini 3 Pro Preview shipped Nov 2025, retired Mar 2026 in favour of 3.1 Pro Preview; Gemini 3 Flash Preview shipped Dec 2025; Gemini 3.1 Flash-Lite hit GA on May 7, 2026. GA for 3.1 Pro is expected later in 2026 once the preview surface stabilises — Google has not announced a date.

Q: Can I get a discount?

Three stackable discounts on the Gemini API: Batch −50% (24-hour async jobs), Flex −50% (best-effort latency), and context caching −90% on cached input tokens. The free tier on AI Studio is the most generous in the industry for development — large request quotas before paid tier kicks in. Vertex side adds committed-use discounts for enterprise contracts.

Q: What about Gemini 1.5 / 2.0?

The Gemini 1.5 family (1.5 Pro, 1.5 Flash, 1.5 Flash-8B) was retired on Sep 29, 2025. Gemini 2.0 Flash + 2.0 Flash-Lite were deprecated as of Feb 18, 2026 and shut down on Jun 1, 2026 — migrate to 2.5 Flash / 2.5 Flash-Lite.

Google DeepMind is the AI division of Alphabet, formed in April 2023 by merging the original DeepMind (founded London, 2010, acquired by Google in 2014) with Google Brain. CEO Demis Hassabis. Ships the Gemini family — the most multimodal frontier lineup on the market (text + image + audio + video native) — through both the Gemini API and the cloud platform now rebranded as Gemini Enterprise Agent Platform (formerly Vertex AI, renamed April 2026).

Open AI Studio →View all models Calculate cost

Production models

5 GA + 5 preview tracked

Founded (parent)

1998

DeepMind merged Apr 2023

SWE-bench leader

78%

Gemini 3 Flash Preview

Cheapest tier

$0.10/M

2.5 Flash-Lite input

Max context

2M tokens

Gemini 2.5 Pro

Last price move

0d ago

Gemini 3.5 Flash re-verified

§ 01 / LINEUP

The full roster.

Side-by-side →

FRONTIER-FLASH

Gemini 3.5 Flash LAUNCHED MAY 2026

GA model released May 19, 2026.

Gemini 3.1 Flash-Lite LAUNCHED MAY 2026

Standard text, image, and video input is $0.25/M; audio input is $0.50/M; output includes thinking tokens.

Gemini 2.5 Pro LAUNCHED JUN 2025

Tiered pricing: $1.25/M input, $0.125/M cached input, and $10/M output for prompts <= 200K tokens; $2.50/$0.25/$15 above 200K.

Gemini 2.5 Flash LAUNCHED JUN 2025

Standard text, image, and video input is $0.30/M; audio input is $1/M; output includes thinking tokens.

Gemini 2.5 Flash-Lite LAUNCHED JUL 2025

Standard text, image, and video input is $0.10/M; audio input is $0.30/M; output includes thinking tokens.

PREVIEW · EARLY ACCESS

Gemini 3.1 Pro Preview LAUNCHED MAY 2026

Preview-stage Gemini 3.1 Pro.

PREVIEW · EARLY ACCESS

Gemini 3.1 Flash-Lite Preview

Preview variant of Flash-Lite 3.1.

PREVIEW · EARLY ACCESS

Gemini 3 Flash Preview LAUNCHED DEC 2025

Standard text, image, and video input is $0.50/M; audio input is $1/M; output includes thinking tokens.

PREVIEW · EARLY ACCESS

Gemini 2.5 Flash-Lite Preview (09-2025)

Dated snapshot of 2.5 Flash-Lite from September 2025 preview drop.

PREVIEW · EARLY ACCESS

Gemini 2.5 Computer Use Preview

Specialised preview for computer-use / browser-agent loops.

documented elsewhere ctx · vision →

DEPRECATED · 2026-02-18

Gemini 2.0 Flash LAUNCHED FEB 2025

Google lists Gemini 2.0 Flash at $0.10/M input for text, image, and video, $0.70/M audio input, $0.40/M output, and $0.025/M cached text/image/video input.

DEPRECATED · 2026-02-18

Gemini 2.0 Flash-Lite LAUNCHED FEB 2025

Google lists Gemini 2.0 Flash-Lite at $0.075/M input and $0.30/M output.

Gemini 3 Pro Preview LAUNCHED NOV 2025

Archive price mirrors the Gemini 3 Pro Preview list rate preserved in the project snapshot: $2/M input, $0.20/M cached input, and $12/M output for prompts up to 200K tokens; tier 2 above 200K was $…

Gemini 1.5 Flash-8B LAUNCHED OCT 2024

Half-price variant of Gemini 1.5 Flash — at $0.0375 input this was the cheapest production-grade model on any first-party API in late 2024.

Gemini 1.5 Pro LAUNCHED MAY 2024

Archive price uses the last-published project snapshot: $1.25/M input and $5/M output, with a higher >128K prompt tier at $2.50/M input and $10/M output.

Gemini 1.5 Flash LAUNCHED MAY 2024

Archive price uses the last-published project snapshot: $0.075/M input and $0.30/M output, with a higher >128K prompt tier at $0.15/M input and $0.60/M output.

Gemini 1.0 Pro LAUNCHED DEC 2023

Google's first general-availability Gemini API model — released December 13, 2023 as part of the original Gemini 1.0 family (alongside Ultra and Nano).

32K ctx · text-only →

§ 02 / SHELF

All side-by-side.

Methodology →

Model	Input /M	Output /M	Cached	Context	Max output	Vision	Tools	Tier
Gemini 3.5 Flash FLAGSHIP	$1.50	$9.00	$0.15−90%	1M	—	✓	✓	Active
Gemini 3.1 Flash-Lite	$0.25	$1.50	$0.03−90%	1M	—	✓	✓	Light
Gemini 2.5 Pro	$1.25	$10	$0.13−90%	2M	—	✓	✓	Frontier
Gemini 2.5 Flash	$0.3	$2.50	$0.03−90%	1M	—	✓	✓	Mid
Gemini 2.5 Flash-Lite	$0.1	$0.4	$0.01−90%	1M	—	✓	✓	Light
Gemini 3.1 Pro Preview PREVIEW	$2.00	$12	$0.2−90%	1M	—	✓	✓	Preview
Gemini 3.1 Flash-Lite Preview PREVIEW	$0.25	$1.50	$0.03−90%	1M	—	✓	✓	Preview
Gemini 3 Flash Preview PREVIEW	$0.5	$3.00	$0.05−90%	1M	—	✓	✓	Preview
Gemini 2.5 Flash-Lite Preview (09-2025) PREVIEW	$0.1	$0.4	$0.01−90%	1M	—	✓	✓	Preview
Gemini 2.5 Computer Use Preview PREVIEW	$1.25	$10	—	documented elsewhere	—	✓	✓	Preview
Gemini 2.0 Flash DEPRECATED	$0.1	$0.4	$0.03−75%	1M	—	✓	✓	Deprecated
Gemini 2.0 Flash-Lite DEPRECATED	$0.07	$0.3	—	1M	—	✓	✓	Deprecated
Gemini 3 Pro Preview RETIRED	—	—	—	Retired Mar 9, 2026	→ gemini-3-1-pro-preview	✓	✗	Retired
Gemini 1.5 Flash-8B RETIRED	—	—	—	Retired Sep 29, 2025	→ gemini-2-5-flash-lite	✓	✗	Retired
Gemini 1.5 Pro RETIRED	—	—	—	Retired Sep 29, 2025	→ gemini-2-5-pro	✓	✗	Retired
Gemini 1.5 Flash RETIRED	—	—	—	Retired Sep 29, 2025	→ gemini-2-5-flash	✓	✗	Retired
Gemini 1.0 Pro RETIRED	—	—	—	Retired Feb 18, 2025	→ gemini-2-5-flash	✗	✗	Retired

§ 03 / PRICE CURVE

Pricing across the lineup.

How Google priced 14 models · DEC 23 → MAY 26.

oldest → newest →

Input · newest $1.5/M

Output · newest $9/M

Each point is a model at its listed $/M price.

The last 12 months at Google DeepMind.

Major releases, deprecations, and corporate moves. Sourced from ai.google.dev changelog, Google DeepMind blog, and recent press.

MAY 19 · 2026

Gemini 3.5 Flash released GA — sustained frontier performance for agentic and coding tasks, currently listed at $1.50/$9.00

RELEASE

MAY 07 · 2026

Gemini 3.1 Pro Preview + Gemini 3.1 Flash-Lite released — new preview surface for the 3.x line

RELEASE

APR 23 · 2026

Vertex AI rebranded as Gemini Enterprise Agent Platform — full agent stack, RAG, and security controls under one product

CORPORATE

MAR 09 · 2026

Gemini 3 Pro Preview retired — superseded by Gemini 3.1 Pro Preview ($2/$12 stays the same)

PRICING

MAR · 2026

Lyria 3 Pro released — extended text-to-music track creation

RELEASE

FEB 18 · 2026

Gemini 2.0 Flash + Flash-Lite deprecated — full retirement completed on Jun 1, 2026

PRICING

FEB · 2026

Lyria 3 launched — first text-to-music model in the family

RELEASE

DEC 17 · 2025

Gemini 3 Flash Preview released — 78% on SWE-bench Verified, beats Gemini 3 Pro on agentic coding at a quarter the price

RELEASE

NOV 18 · 2025

Gemini 3 Pro Preview released — first fully multimodal reasoning Pro tier

RELEASE

SEP 29 · 2025

Gemini 1.5 family retired — 1.5 Pro, 1.5 Flash, 1.5 Flash-8B shut down

PRICING

SEP · 2025

Gemini Robotics 1.5 released — embodied AI improvements for physical interaction

RELEASE

JUL 22 · 2025

Gemini 2.5 Flash-Lite GA — cheapest GA tier on the Gemini API at $0.10/$0.40

RELEASE

JUN 17 · 2025

Gemini 2.5 Pro + Flash GA — production-grade reasoning with 2M context on Pro

RELEASE

MAY · 2025

Veo 3 launched — video generation with synchronised audio

RELEASE

§ 04 / ACCESS

Where to get it.

Methodology →

PRIMARY · DIRECT

Gemini API + AI Studio

aistudio.google.com and ai.google.dev — direct API access, free-tier playground, batch + flex tiers. Free quota for development; production runs on the paid pay-as-you-go rates shown here.

Free + paid tiers →

CLOUD · GCP

Gemini Enterprise Agent Platform

Renamed from Vertex AI on April 23, 2026. Full agent stack — RAG, security command center, agent builder. Data residency at-rest available in US / EU / global multi-regions. Gemini 3.x is not yet in EU regions; 2.5 Pro and 2.0 Flash are available in europe-west4.

GCP · regional + global →

CONSUMER · WEB

Gemini app

gemini.google.com — end-user product (free + Advanced subscription). Includes Workspace integration, Deep Research, image gen via Imagen, video gen via Veo. No API access.

Free + Advanced $20/mo →

PRODUCTIVITY · WORKSPACE

Workspace integration

Native Gemini inside Docs, Sheets, Slides, Gmail, Meet. Enterprise plans get data-boundary guarantees and admin controls. Billed per seat, not per token.

Per-seat enterprise →

§ 04 / BEST FOR

Which Google for what.

More scenarios →

If you want the newest GA Gemini model for coding agents…

Gemini 3.5 Flash

Profile →

If you need the longest context window on a frontier model…

Gemini 2.5 Pro (2M)

Profile →

If you want frontier reasoning + multimodal in preview…

Gemini 3.1 Pro Preview

Profile →

If you need SWE-bench-class agentic coding at mid-tier cost…

Gemini 3 Flash Preview

Profile →

If you want a stable daily driver with thinking tokens…

Gemini 2.5 Flash

Profile →

If you're doing high-volume classification on the cheap…

Gemini 2.5 Flash-Lite

Profile →

If you're building a browser-agent / computer-use loop…

Gemini 2.5 Computer Use Preview

Profile →

If you need EU data residency for compliance…

Gemini 2.5 Pro on Vertex europe-west4

Vertex docs →

§ 06 / BACKGROUND

The company behind it.

deepmind.google →

Google DeepMind is the AI research and product division of Alphabet Inc. (the holding company spun out from Google in 2015). It was formed in April 2023 by merging the original DeepMind — founded in London in November 2010 by Demis Hassabis, Shane Legg, and Mustafa Suleyman, acquired by Google for $400–650M in January 2014 — with the previously separate Google Brain team inside Google. CEO: Demis Hassabis.

The merger ended a decade of internal duality. Pre-merger, Google ran two distinct AI orgs: Google Brain (Mountain View, headed by Jeff Dean) shipped TensorFlow, the original Transformer paper, and most of Google's product AI; DeepMind (London, headed by Hassabis) produced AlphaGo, AlphaFold, and most of the lab's public scientific breakthroughs. The merge consolidated leadership and unblocked the shipping cadence that produced the Gemini family.

Major technical contributions: AlphaGo / AlphaZero / MuZero (game-playing systems that beat world champions at Go, chess, shogi); AlphaFold (protein structure prediction — 2024 Nobel Prize in Chemistry for Hassabis and John Jumper); the original Transformer architecture (Vaswani et al., 2017, Google Brain); Gemini (multimodal LLM family, first released Dec 6, 2023); Imagen (image gen), Veo (video gen), and Lyria (music gen).

As an Alphabet subsidiary, Google DeepMind benefits from Alphabet's full revenue base (~$350B annual, 2025) and Google Cloud's compute footprint. It does not raise external venture funding the way Anthropic and OpenAI do — capital and TPU access come from inside the org. The catch: AI products compete for runway against Google Search, Google Cloud, and YouTube within the same P&L; product velocity has historically been slower than smaller pure-play labs.

Compared to OpenAI and Anthropic: the deepest multimodal stack (text + image + audio + video natively in one model), the longest context window on a flagship (2M on Gemini 2.5 Pro), and the broadest distribution via Workspace + Gemini app + Android. Weaker on coding-agent benchmarks (Anthropic still leads SWE-bench Verified; Gemini 3 Flash hits 78% vs Claude Opus 4.7's 87.6%) and on consumer brand recognition for ChatGPT-style chat.

§ 07 / COMPETITORS

Other frontier labs.

All providers →

PRIMARY RIVAL

OpenAI

Bigger consumer brand via ChatGPT, deeper Microsoft / Azure integration, faster product cadence on the API. Behind Google on max context window (1M vs 2M) and on native video / audio modalities.

GPT-5 family →

CODING SPECIALIST

Anthropic

Leads SWE-bench Verified at 87.6% with Claude Opus 4.7 — Google's best is Gemini 3 Flash at 78%. Stronger on zero-retention defaults and Responsible Scaling Policy. Smaller multimodal surface.

Claude family →

WILDCARD

xAI

X / Twitter native integration and Memphis Colossus supercluster. Smaller multimodal surface and shorter context windows than Gemini; ecosystem is the moat or the bottleneck depending on your stack.

Grok 4 family →

EU CHAMPION

Mistral

EU sovereignty and GDPR-native by construction. Stronger French / German support. Lower frontier benchmarks but more transparent provenance; viable on EU compliance contracts where Vertex still lacks Gemini 3 in EU regions.

Mistral Large 3 →

OPEN-WEIGHT LEADER

Alibaba (Qwen)

The Qwen (Tongyi Qianwen) model family from Alibaba Cloud's Tongyi Lab. First released in 2023, Qwen is the most-downloaded open-weight model family in the world — most tiers ship under Apache 2.0 on Hugging Face and ModelScope, while the proprietary Max tier is served through Alibaba Cloud's Model Studio.

28 models · from $0.05/M →

Frequently asked.

Practical questions about Google DeepMind, the Gemini lineup, and Vertex / Gemini Enterprise Agent Platform.

Q · 01 Which Gemini model should I start with? +

For most production use cases: Gemini 2.5 Pro. It's GA-stable, has the longest context window in the lineup (2M), and is cheaper than the 3.1 preview at $1.25/$10. Move to Gemini 3.1 Pro Preview for newest reasoning + the 3.x preview surface. Move down to Gemini 2.5 Flash for daily traffic, 2.5 Flash-Lite for classification. For coding agents, Gemini 3 Flash Preview hits 78% SWE-bench Verified at a quarter the cost of Pro.

Q · 02 How does Gemini compare to Claude and GPT? +

Strongest at long context (2M on 2.5 Pro vs 1M / 200K on rivals), native multimodal (text + image + audio + video all in one model), and productivity-suite integration via Workspace. Weaker than Anthropic on coding-agent benchmarks (78% vs 87.6% on SWE-bench Verified). Less polished consumer chat product than OpenAI's ChatGPT despite the Gemini app's Workspace integration.

Q · 03 Does Google offer EU data residency? +

Yes, with a caveat. Vertex AI / Gemini Enterprise Agent Platform exposes EU multi-region data residency at-rest, and Gemini 2.5 Pro + 2.0 Flash are available in europe-west4 (Netherlands) for GDPR-compliant workloads. Gemini 3.x is not yet in EU regions; if you need the newest models, you're still pinned to US / global endpoints, which the Vertex docs warn do not guarantee in-region processing.

Q · 04 What's the data retention policy? +

On the paid Gemini API tier: prompts and responses are not used to improve models and are retained for a short period for abuse monitoring (Google does not publish a fixed number for the paid tier; check the latest Gemini API additional terms). On the free tier: prompts and responses are used to improve products. On Vertex / Gemini Enterprise: customer-controlled, no training on inputs, contractual data-residency commitments.

Q · 05 When will Gemini 3 hit general availability? +

The 3.x family is still in preview as of May 19, 2026. Cadence so far: Gemini 3 Pro Preview shipped Nov 2025, retired Mar 2026 in favour of 3.1 Pro Preview; Gemini 3 Flash Preview shipped Dec 2025; Gemini 3.1 Flash-Lite hit GA on May 7, 2026. GA for 3.1 Pro is expected later in 2026 once the preview surface stabilises — Google has not announced a date.

Q · 06 Can I get a discount? +

Three stackable discounts on the Gemini API: Batch −50% (24-hour async jobs), Flex −50% (best-effort latency), and context caching −90% on cached input tokens. The free tier on AI Studio is the most generous in the industry for development — large request quotas before paid tier kicks in. Vertex side adds committed-use discounts for enterprise contracts.

Q · 07 What about Gemini 1.5 / 2.0? +

The Gemini 1.5 family (1.5 Pro, 1.5 Flash, 1.5 Flash-8B) was retired on Sep 29, 2025. Gemini 2.0 Flash + 2.0 Flash-Lite were deprecated as of Feb 18, 2026 and shut down on Jun 1, 2026 — migrate to 2.5 Flash / 2.5 Flash-Lite.

Reviewed by Yaroslav Vikhariev Founder · AI//COST · All Gemini models tested · Pricing pulled daily from ai.google.dev

Methodology Report a correction More by Y.V.

Gemini API Pricing

The full roster.

Gemini 3.5 Flash LAUNCHED MAY 2026

Gemini 3.1 Flash-Lite LAUNCHED MAY 2026

Gemini 2.5 Pro LAUNCHED JUN 2025

Gemini 2.5 Flash LAUNCHED JUN 2025

Gemini 2.5 Flash-Lite LAUNCHED JUL 2025

Gemini 3.1 Pro Preview LAUNCHED MAY 2026

Gemini 3.1 Flash-Lite Preview

Gemini 3 Flash Preview LAUNCHED DEC 2025

Gemini 2.5 Flash-Lite Preview (09-2025)

Gemini 2.5 Computer Use Preview

Gemini 2.0 Flash LAUNCHED FEB 2025

Gemini 2.0 Flash-Lite LAUNCHED FEB 2025

Gemini 3 Pro Preview LAUNCHED NOV 2025

Gemini 1.5 Flash-8B LAUNCHED OCT 2024

Gemini 1.5 Pro LAUNCHED MAY 2024

Gemini 1.5 Flash LAUNCHED MAY 2024

Gemini 1.0 Pro LAUNCHED DEC 2023

All side-by-side.

Pricing across the lineup.

How Google priced 14 models · DEC 23 → MAY 26.

The last 12 months at Google DeepMind.

Where to get it.

Which Google for what.

The company behind it.

Other frontier labs.

OpenAI

Anthropic

xAI

Mistral

Meta

Alibaba (Qwen)

Frequently asked.