Google: Gemini Video Understanding

Model Overview

Property	Value
Model ID	`google/gemini-video-understanding`
Name	Gemini Video Understanding
Status	Stable
Released	2024-12-01

Description

Google: Gemini Video Understanding is a model from Google for video analysis and understanding. It supports the video and text modalities.

Specifications

Spec	Value
Context Window	100,000 tokens
Max Output	4,096 tokens
Modalities	video, text
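
The two token limits above interact when choosing a `max_tokens` value. The following is a minimal sketch, assuming that prompt and output tokens share the 100,000-token context window (the page does not say so explicitly); `clamp_max_tokens` is a hypothetical helper, not part of any LangMart or Google SDK.

```python
CONTEXT_WINDOW = 100_000  # tokens, from the Specifications table
MAX_OUTPUT = 4_096        # tokens, from the Specifications table


def clamp_max_tokens(prompt_tokens: int, requested: int = MAX_OUTPUT) -> int:
    """Return a max_tokens value that respects both limits.

    Assumption: prompt and output tokens count against the same
    context window. Adjust if the provider documents otherwise.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    # Tokens left in the window once the prompt is accounted for.
    remaining = max(CONTEXT_WINDOW - prompt_tokens, 0)
    return min(requested, MAX_OUTPUT, remaining)
```

For a short prompt the result is simply the 4,096-token output cap; the window only becomes the binding limit for prompts above roughly 95,900 tokens.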

Pricing

Type	Price
Input	$0.05/1M tokens
Output	$0.15/1M tokens
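
To make the per-token rates concrete, here is a small sketch that estimates the cost of a single request from the prices listed above. The function name `estimate_cost_usd` is hypothetical, and the estimate assumes billing is strictly linear in token count with no minimums or surcharges, which the page does not confirm.

```python
INPUT_USD_PER_MILLION = 0.05   # from the Pricing table
OUTPUT_USD_PER_MILLION = 0.15  # from the Pricing table


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD, assuming purely linear per-token billing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# A full 100,000-token prompt with a 4,096-token reply:
print(f"{estimate_cost_usd(100_000, 4_096):.7f}")  # prints 0.0056144
```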

Capabilities

Text: Yes
Image: No
Audio: No
Video: Yes
Tool Use: No
JSON Mode: No

Key Features

Multimodal Support - Text and video
Large Context - Up to 100,000 tokens
Tool Use - Not supported
JSON Mode - Not available
Streaming - Real-time generation
Cost Effective - $0.05 per 1M input tokens, $0.15 per 1M output tokens

Best For

Video analysis
Content understanding
Summarization
Scene understanding

Data & Usage Policies

Policy	Status
Training Data	Not used for training
Prompt Retention	Does not retain prompts
Data Processing	Google Cloud privacy compliant

Status & Availability

Status: STABLE
Free Tier: No
Provider: Google

API Usage Example

curl https://api.langmart.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "google/gemini-video-understanding",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'
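
The same request can be sketched in Python using only the standard library. It mirrors the curl call above (same URL, headers, and body). `build_request` and `send` are hypothetical helper names. The page does not show how video input is attached or what the response looks like, so this sketch stays with the text-only message and returns the decoded JSON as-is.

```python
import json
import urllib.request

API_URL = "https://api.langmart.ai/v1/chat/completions"
MODEL_ID = "google/gemini-video-understanding"


def build_request(prompt: str, api_key: str,
                  max_tokens: int = 4096) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON body (shape not assumed)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_request("Hello", "YOUR_API_KEY"))` would perform the network call; keeping construction and sending separate makes the payload easy to inspect first.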

Related Models

google/gemini-3-pro-preview - Latest flagship
google/gemini-2.5-pro - Advanced 2.5 model
google/gemini-2.0-flash - Fast multimodal
google/gemma-3-27b-it - Open-source alternative

Source

Generated for LangMart AI Platform on 2025-12-28
