Google: Gemini Multimodal Live

Model Overview

Property    Value
Model ID    google/gemini-multimodal-live
Name        Gemini Multimodal Live
Status      Experimental
Released    2025-11-15

Description

Gemini Multimodal Live is a real-time streaming multimodal model provided by Google. It accepts text, image, audio, and video input and supports tool use and JSON mode.

Specifications

Spec              Value
Context Window    100,000 tokens
Max Output        8,000 tokens
Modalities        text, image, audio, video, stream
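
The two limits above can be checked before a request is sent. The sketch below assumes that prompt tokens and requested output tokens share the 100,000-token context window, which this page does not state explicitly; `fits_budget` is a hypothetical helper name, and the token counts are supplied by the caller.

```shell
# Limits copied from the Specifications table.
CONTEXT_WINDOW=100000
MAX_OUTPUT=8000

# fits_budget PROMPT_TOKENS REQUESTED_OUTPUT_TOKENS
# Succeeds (exit 0) when the requested output is within the output cap and
# prompt + output fit in the context window (assumption: both share the window).
fits_budget() {
  [ "$2" -le "$MAX_OUTPUT" ] &&
    [ $(( $1 + $2 )) -le "$CONTEXT_WINDOW" ]
}

if fits_budget 95000 8000; then
  echo "fits"
else
  echo "too large"
fi
```

With a 95,000-token prompt, requesting the full 8,000 output tokens would exceed the window under this assumption, so the prompt has to be trimmed or `max_tokens` lowered.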

Pricing

Type      Price
Input     $0.50 per 1M tokens
Output    $1.50 per 1M tokens
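
As a rough illustration of how these rates translate into per-request cost, the sketch below multiplies token counts by the listed prices. `estimate_cost` is a hypothetical helper, not part of any LangMart tooling, and it ignores any surcharges or minimums the platform might apply.

```shell
# Rates taken from the pricing table (USD per 1M tokens).
INPUT_RATE=0.50
OUTPUT_RATE=1.50

# estimate_cost INPUT_TOKENS OUTPUT_TOKENS
# Prints the estimated cost in USD with six decimal places.
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" \
      -v in_rate="$INPUT_RATE" -v out_rate="$OUTPUT_RATE" \
      'BEGIN { printf "%.6f\n", (in_tok * in_rate + out_tok * out_rate) / 1000000 }'
}

# 20,000 input tokens and 1,000 output tokens
estimate_cost 20000 1000   # prints 0.011500
```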

Capabilities

  • Text: Yes
  • Image: Yes
  • Audio: Yes
  • Video: Yes
  • Tool Use: Yes
  • JSON Mode: Yes

Key Features

  1. Multimodal Support - Text, images, audio, and video
  2. Large Context - Up to 100,000 tokens
  3. Tool Use - Supported
  4. JSON Mode - Supported
  5. Streaming - Real-time generation
  6. Cost Effective - $0.50 input and $1.50 output per 1M tokens

Best For

  • Live streaming
  • Real-time analysis
  • Interactive applications
  • Live transcription

Data & Usage Policies

Policy              Status
Training Data       Not used for training
Prompt Retention    Does not retain prompts
Data Processing     Google Cloud privacy compliant

Status & Availability

  • Status: Experimental
  • Free Tier: No
  • Provider: Google

API Usage Example

curl https://api.langmart.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "google/gemini-multimodal-live",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 8000
  }'
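
Since the model lists image input as a capability, a request that pairs text with an image might look like the sketch below. It assumes the endpoint accepts OpenAI-style content parts (`text` and `image_url`), which this page does not confirm; `build_image_payload` and the `LANGMART_API_KEY` variable are hypothetical names, and the image URL is a placeholder.

```shell
# build_image_payload PROMPT IMAGE_URL
# Prints a JSON request body with one text part and one image part.
# Assumption: OpenAI-style content parts are accepted by this endpoint.
# Arguments are inserted verbatim, so they must not contain double quotes
# or backslashes.
build_image_payload() {
  cat <<EOF
{
  "model": "google/gemini-multimodal-live",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "$1"},
      {"type": "image_url", "image_url": {"url": "$2"}}
    ]
  }],
  "max_tokens": 1024
}
EOF
}

# Send the request only when an API key is present in the environment.
if [ -n "${LANGMART_API_KEY:-}" ]; then
  build_image_payload "Describe this image." "https://example.com/frame.jpg" |
    curl https://api.langmart.ai/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $LANGMART_API_KEY" \
      -d @-
fi
```

Building the body in a function keeps the payload easy to inspect before anything is sent; `-d @-` tells curl to read the request body from standard input.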

Related Models

  • google/gemini-3-pro-preview - Latest flagship
  • google/gemini-2.5-pro - Advanced 2.5 model
  • google/gemini-2.0-flash - Fast multimodal
  • google/gemma-3-27b-it - Open-weight alternative

Source

Generated for LangMart AI Platform on 2025-12-28