G

Google Gemini 2.0 Flash Lite - Complete Model Details

Google
Vision
1M
Context
$0.0750
Input /1M
$0.3000
Output /1M
N/A
Max Output

Google Gemini 2.0 Flash Lite - Complete Model Details

Overview

Model Name: Google: Gemini 2.0 Flash Lite
Model ID: google/gemini-2.0-flash-lite-001
Provider: Google Vertex
Created: February 25, 2025
Context Length: 1,048,576 tokens

Description: This model emphasizes "significantly faster time to first token (TTFT) compared to Gemini Flash 1.5" while matching quality expectations of larger variants, at reduced costs.


Pricing

Metric Cost
Input Tokens $0.075 per million
Output Tokens $0.30 per million

Detailed Analysis

Gemini 2.0 Flash-Lite-001 is a stable, numbered release version providing production-ready performance with guaranteed behavior. This specific version ensures consistency across deployments. Why choose over 2.0 Flash: Flash-Lite-001 offers the same cost and latency advantages as the auto-updated Flash-Lite alias but with version pinning for production stability. It excels at high-throughput, structured tasks where you need predictable model behavior across time. Use this over 2.0 Flash when extreme speed and cost efficiency matter more than advanced reasoning capabilities. Stable vs Preview vs Auto-Updated: The -001 suffix indicates this is a numbered stable release, recommended for production use. Unlike the auto-updated alias (gemini-2.0-flash-lite), this version won't change unexpectedly, preventing behavior shifts in production systems. Preview versions may have billing enabled but can be deprecated with 2 weeks notice. 2.0 vs 2.5 generation: As a 2.0-generation model, Flash-Lite-001 represents the first iteration of Google's Lite optimization. Gemini 2.5 Flash-Lite (newer generation) delivers all-around higher quality with better instruction following, reduced verbosity, and stronger multimodal capabilities. The 2.5 generation uses 50% fewer output tokens, making it more cost-effective despite the same pricing. Best for: Production deployments requiring version stability, high-volume classification tasks, content moderation at scale, real-time routing decisions, and batch extraction jobs where output determinism is critical.

  • Gemini Flash 1.5
  • Gemini Pro 1.5
  • Other Google Gemini variants via LangMart

Providers

Platform: Google Vertex
Adapter: GoogleVertexGeminiAdapter
Data Policy:


Performance Metrics

Recent Usage (December 2025):

Date Requests Prompt Tokens Completion Tokens Tool Calls
Dec 23 4.5M 4.07B 446M 29K
Dec 22 3.5M 4.41B 667M 31K
Dec 6 8.4M 6.70B 1.26B 143K

Parameters

Input Modalities:

  • Text
  • Image
  • File
  • Audio
  • Video

Output Modalities:

  • Text only

Supported Parameters:

  • max_tokens
  • temperature
  • top_p
  • seed
  • response_format
  • stop
  • structured_outputs
  • tools
  • tool_choice

Features:

  • File URL support
  • Input audio capability
  • Base64 video input
  • Tool choice support (none, auto, required)
  • Multipart message support
  • Chat completions