Google: Gemini Video Understanding

Model Overview

Property  Value
Model ID  google/gemini-video-understanding
Name      Gemini Video Understanding
Status    Stable
Released  2024-12-01

Description

Google: Gemini Video Understanding is a video understanding model provided by Google. It works with video and text, and is intended for video analysis tasks such as summarization and scene understanding.

Specifications

Spec            Value
Context Window  100,000 tokens
Max Output      4,096 tokens
Modalities      video, text
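
The two token limits above interact when choosing max_tokens. The following is a minimal sketch, not platform behavior: it assumes that prompt and completion tokens share the 100,000-token context window, which this page does not state. The function name clamp_max_tokens is hypothetical.

```python
CONTEXT_WINDOW = 100_000  # tokens, from the Specifications table
MAX_OUTPUT = 4_096        # tokens, from the Specifications table


def clamp_max_tokens(prompt_tokens: int, requested: int = MAX_OUTPUT) -> int:
    """Return a max_tokens value that respects both listed limits.

    Assumption (not confirmed by the source): prompt and completion
    tokens are counted against the same context window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt leaves no room for output in the context window")
    # The smallest of: what the caller asked for, the output cap,
    # and the space left in the window.
    return min(requested, MAX_OUTPUT, remaining)
```

Under that assumption, a 98,000-token prompt would leave room for at most 2,000 output tokens, while a short prompt is limited only by the 4,096-token output cap.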

Pricing

Type    Price
Input   $0.05/1M tokens
Output  $0.15/1M tokens
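
To make the per-million pricing concrete, here is a small sketch that estimates the cost of a single request from the listed rates. The function name estimate_cost_usd is hypothetical, and the calculation assumes billing is strictly linear in token count, with no minimums or rounding.

```python
INPUT_PRICE_PER_M = 0.05   # USD per 1M input tokens, from the Pricing table
OUTPUT_PRICE_PER_M = 0.15  # USD per 1M output tokens, from the Pricing table


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD, assuming purely linear per-token billing."""
    total_per_million = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total_per_million / 1_000_000
```

For example, a request with 90,000 input tokens and 4,000 output tokens would come to roughly $0.0051 under these rates.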

Capabilities

  • Text: Yes
  • Image: No
  • Audio: No
  • Video: Yes
  • Tool Use: No
  • JSON Mode: No
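
A client could use the flags above to catch unsupported options before sending a request. The sketch below is illustrative only: the helper unsupported_features is hypothetical, and the payload field names tools and response_format are assumed from the OpenAI-style chat completions convention, which this page does not confirm for this endpoint.

```python
# Capability flags copied from the list above.
CAPABILITIES = {
    "text": True,
    "image": False,
    "audio": False,
    "video": True,
    "tool_use": False,
    "json_mode": False,
}


def unsupported_features(payload: dict) -> list[str]:
    """Return the capability names a request payload relies on but the model lacks.

    Assumption: tool use is requested via a "tools" field and JSON mode via
    a "response_format" field, as in OpenAI-style chat completion APIs.
    """
    problems = []
    if payload.get("tools") and not CAPABILITIES["tool_use"]:
        problems.append("tool_use")
    if payload.get("response_format") and not CAPABILITIES["json_mode"]:
        problems.append("json_mode")
    return problems
```

An empty list means the payload uses none of the checked features; it does not guarantee that the request will be accepted.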

Key Features

  1. Multimodal Support - Text and video
  2. Large Context - Up to 100,000 tokens
  3. Tool Use - Not supported
  4. JSON Mode - Not available
  5. Streaming - Real-time generation
  6. Cost Effective - $0.05 input and $0.15 output per 1M tokens

Best For

  • Video analysis
  • Content understanding
  • Summarization
  • Scene understanding

Data & Usage Policies

Policy            Status
Training Data     Not used for training
Prompt Retention  Does not retain prompts
Data Processing   Google Cloud privacy compliant

Status & Availability

  • Status: STABLE
  • Free Tier: No
  • Provider: Google

API Usage Example

curl https://api.langmart.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "google/gemini-video-understanding",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'
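
The same request can be built from Python using only the standard library. This is a sketch that mirrors the curl call above. The helper name build_request is hypothetical, and because this page does not show how video input is attached, the example sends a text-only message just as the curl call does.

```python
import json
import urllib.request

API_URL = "https://api.langmart.ai/v1/chat/completions"


def build_request(api_key: str, prompt: str, max_tokens: int = 4096) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = {
        "model": "google/gemini-video-understanding",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Sending is left to the caller, for example:
# with urllib.request.urlopen(build_request("YOUR_API_KEY", "Hello")) as resp:
#     print(json.load(resp))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.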

Related Models

  • google/gemini-3-pro-preview - Latest flagship
  • google/gemini-2.5-pro - Advanced 2.5 model
  • google/gemini-2.0-flash - Fast multimodal
  • google/gemma-3-27b-it - Open-source alternative

Source

Generated for LangMart AI Platform on 2025-12-28