Llama 4 Scout - Model Details

Meta | Vision | 10M Context | $0.08 Input /1M | $0.30 Output /1M | Max Output: N/A

Overview

Meta's Llama 4 Scout is a mixture-of-experts (MoE) language model that activates 17 billion parameters out of 109 billion total. The model accepts multimodal inputs (text and images) and produces multilingual text and code outputs across 12 languages. It was trained on approximately 40 trillion tokens and uses 16 experts, with one routed expert (plus a shared expert) active per token.

Key Details:

  • Created: April 5, 2025
  • License: Llama 4 Community License
  • Training Data Cutoff: August 2024
  • Input Modalities: Text, Image
  • Output Modalities: Text
  • Architecture: Mixture-of-Experts with 16 experts
  • Maximum Context Length: 10,000,000 tokens (327,680 tokens via LangMart)
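
Because the model takes text and image inputs through a chat interface, a request can attach an image alongside a prompt. The sketch below is illustrative only: it assumes LangMart exposes an OpenAI-compatible endpoint, and the base URL, environment variable, and model slug are placeholders rather than confirmed values.

```python
# Minimal multimodal request sketch against an assumed OpenAI-compatible endpoint.
# Base URL, credential variable, and model slug are placeholders, not confirmed values.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://langmart.ai/api/v1",   # assumption: OpenAI-compatible gateway
    api_key=os.environ["LANGMART_API_KEY"],  # hypothetical credential variable
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",  # placeholder slug
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is happening in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```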

Pricing

Provider: DeepInfra

  • Input tokens: $0.08 per 1M tokens
  • Output tokens: $0.30 per 1M tokens
  • Image tokens: $0.0003342 per token
  • Quantization: FP8
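
At these rates, the cost of a call is a simple linear function of token counts. A quick illustrative calculation (the token counts below are made up):

```python
# Illustrative cost math for the DeepInfra rates listed above.
INPUT_PER_M = 0.08            # USD per 1M input tokens
OUTPUT_PER_M = 0.30           # USD per 1M output tokens
IMAGE_PER_TOKEN = 0.0003342   # USD per image token, as listed

def request_cost(prompt_tokens: int, completion_tokens: int, image_tokens: int = 0) -> float:
    return (
        prompt_tokens / 1_000_000 * INPUT_PER_M
        + completion_tokens / 1_000_000 * OUTPUT_PER_M
        + image_tokens * IMAGE_PER_TOKEN
    )

# Example: 4,000 prompt tokens and 800 completion tokens, no images.
print(f"${request_cost(4_000, 800):.6f}")  # $0.000560
```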

Performance

Usage Statistics:

  • Daily requests: 600,000+
  • Prompt tokens: 1-1.3B per day
  • Completion tokens: 80-300M per day
  • Use cases: Assistant-style interaction, visual reasoning, multilingual chat, image captioning, visual understanding

Access via LangMart Chat: https://langmart.ai/chat

Model weights available on Hugging Face: meta-llama/Llama-4-Scout-17B-16E-Instruct
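
For local experimentation with the Hugging Face weights, a rough loading sketch is below. It assumes a recent transformers release with Llama 4 support and enough GPU memory for the 109B-parameter checkpoint; exact class names and arguments may differ across versions.

```python
# Rough local-inference sketch; assumes a transformers version with Llama 4 support
# and hardware able to hold the 109B-parameter checkpoint.
import torch
from transformers import AutoProcessor, Llama4ForConditionalGeneration

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = Llama4ForConditionalGeneration.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16
)

messages = [
    {"role": "user", "content": [{"type": "text", "text": "Summarize MoE routing in two sentences."}]}
]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
).to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens.
print(processor.batch_decode(output[:, inputs["input_ids"].shape[-1]:], skip_special_tokens=True)[0])
```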

Providers

Active Provider: DeepInfra

Parameters

Supported generation parameters (see the request sketch after this list):

  • max_tokens
  • temperature
  • top_p
  • stop
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • top_k
  • seed
  • min_p
  • response_format
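
These map onto standard OpenAI-style request fields; parameters the OpenAI SDK does not model directly (top_k, min_p, repetition_penalty) would typically be passed through in the request body. A hedged sketch, reusing the same assumed endpoint and model slug as the earlier multimodal example:

```python
# Sketch of passing the supported sampling parameters; endpoint and model slug
# are the same placeholder assumptions as in the earlier example.
import os
from openai import OpenAI

client = OpenAI(base_url="https://langmart.ai/api/v1", api_key=os.environ["LANGMART_API_KEY"])

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",  # placeholder slug
    messages=[{"role": "user", "content": "Give three tips for writing unit tests."}],
    max_tokens=400,
    temperature=0.7,
    top_p=0.9,
    stop=["\n\n"],
    frequency_penalty=0.1,
    presence_penalty=0.0,
    seed=42,
    response_format={"type": "text"},
    extra_body={  # provider-specific extras, forwarded in the request body as-is
        "top_k": 40,
        "min_p": 0.05,
        "repetition_penalty": 1.05,
    },
)
print(response.choices[0].message.content)
```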