Google Gemini 2.0 Flash Lite - Complete Model Details

Overview

Model Name: Google: Gemini 2.0 Flash Lite
Model ID: google/gemini-2.0-flash-lite-001
Provider: Google Vertex
Created: February 25, 2025
Context Length: 1,048,576 tokens

Description: This model emphasizes "significantly faster time to first token (TTFT) compared to Gemini Flash 1.5" while matching quality expectations of larger variants, at reduced costs.

Pricing

Metric	Cost
Input Tokens	$0.075 per million
Output Tokens	$0.30 per million

Detailed Analysis

Gemini 2.0 Flash-Lite-001 is a stable, numbered release version providing production-ready performance with guaranteed behavior. This specific version ensures consistency across deployments. Why choose over 2.0 Flash: Flash-Lite-001 offers the same cost and latency advantages as the auto-updated Flash-Lite alias but with version pinning for production stability. It excels at high-throughput, structured tasks where you need predictable model behavior across time. Use this over 2.0 Flash when extreme speed and cost efficiency matter more than advanced reasoning capabilities. Stable vs Preview vs Auto-Updated: The -001 suffix indicates this is a numbered stable release, recommended for production use. Unlike the auto-updated alias (gemini-2.0-flash-lite), this version won't change unexpectedly, preventing behavior shifts in production systems. Preview versions may have billing enabled but can be deprecated with 2 weeks notice. 2.0 vs 2.5 generation: As a 2.0-generation model, Flash-Lite-001 represents the first iteration of Google's Lite optimization. Gemini 2.5 Flash-Lite (newer generation) delivers all-around higher quality with better instruction following, reduced verbosity, and stronger multimodal capabilities. The 2.5 generation uses 50% fewer output tokens, making it more cost-effective despite the same pricing. Best for: Production deployments requiring version stability, high-volume classification tasks, content moderation at scale, real-time routing decisions, and batch extraction jobs where output determinism is critical.

Gemini Flash 1.5
Gemini Pro 1.5
Other Google Gemini variants via LangMart

Providers

Platform: Google Vertex
Adapter: GoogleVertexGeminiAdapter
Data Policy:

No training on user data
No prompt retention
Requires User IDs
Terms: https://cloud.google.com/terms/

Performance Metrics

Recent Usage (December 2025):

Date	Requests	Prompt Tokens	Completion Tokens	Tool Calls
Dec 23	4.5M	4.07B	446M	29K
Dec 22	3.5M	4.41B	667M	31K
Dec 6	8.4M	6.70B	1.26B	143K

Parameters

Input Modalities:

Text
Image
File
Audio
Video

Output Modalities:

Text only

Supported Parameters:

max_tokens
temperature
top_p
seed
response_format
stop
structured_outputs
tools
tool_choice

Features:

File URL support
Input audio capability
Base64 video input
Tool choice support (none, auto, required)
Multipart message support
Chat completions

Google Gemini 2.0 Flash Lite - Complete Model Details

Google Gemini 2.0 Flash Lite - Complete Model Details

Overview

Pricing

Detailed Analysis

Related Models

Providers

Performance Metrics

Parameters