Llama 4 Scout - Model Details
Overview
Meta's Llama 4 Scout is a mixture-of-experts (MoE) language model that activates 17 billion parameters per token out of 109 billion total parameters. It accepts multimodal input (text and images) and produces multilingual text and code output across 12 languages. The model was trained on approximately 40 trillion tokens, and each MoE layer contains 16 experts.
Key Details:
- Created: April 5, 2025
- License: Llama 4 Community License
- Training Data Cutoff: August 2024
- Input Modalities: Text, Image
- Output Modalities: Text
- Architecture: Mixture-of-Experts (MoE) with 16 experts (see the routing sketch after this list)
- Maximum Context Length: 10,000,000 tokens (327,680 tokens via LangMart)
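To illustrate why only 17B of the 109B parameters are active for any given token, here is a minimal, toy top-1 routing sketch. It is not Meta's implementation: the hidden sizes, the ReLU experts, and the single-active-expert assumption are simplifications chosen purely for readability.

```python
# Illustrative only: a toy top-1 MoE routing step, NOT Meta's implementation.
# Dimensions and the single-active-expert assumption are simplifications that
# show why only a fraction of the total parameters fire for each token.
import numpy as np

rng = np.random.default_rng(0)

num_experts = 16   # Scout's MoE layers have 16 experts
d_model = 64       # toy hidden size (the real model is far larger)
d_ff = 256         # toy expert feed-forward size

# Router plus 16 expert feed-forward blocks (random weights for illustration).
router_w = rng.standard_normal((d_model, num_experts))
experts = [
    (rng.standard_normal((d_model, d_ff)), rng.standard_normal((d_ff, d_model)))
    for _ in range(num_experts)
]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its single highest-scoring expert (top-1)."""
    scores = x @ router_w                    # (tokens, num_experts)
    chosen = scores.argmax(axis=-1)          # one expert index per token
    out = np.zeros_like(x)
    for e in range(num_experts):
        mask = chosen == e
        if not mask.any():
            continue                         # this expert's weights stay idle
        w_in, w_out = experts[e]
        h = np.maximum(x[mask] @ w_in, 0.0)  # toy ReLU feed-forward
        out[mask] = h @ w_out
    return out

tokens = rng.standard_normal((8, d_model))   # a batch of 8 token embeddings
print(moe_layer(tokens).shape)               # (8, 64): each token used 1 of 16 experts
```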
Pricing
Provider: DeepInfra
| Item | Cost |
|---|---|
| Input Tokens | $0.08 per 1M tokens |
| Output Tokens | $0.30 per 1M tokens |
| Image Input | $0.0003342 per image |

Quantization: FP8
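As a back-of-the-envelope check on these rates, the sketch below estimates the cost of a single request. The token and image counts are hypothetical, and the per-image figure simply reuses the rate listed above.

```python
# Rough cost estimate for one request at the DeepInfra rates listed above.
# The token and image counts below are hypothetical, purely for illustration.
INPUT_PER_M = 0.08      # USD per 1M input tokens
OUTPUT_PER_M = 0.30     # USD per 1M output tokens
PER_IMAGE = 0.0003342   # USD per image (rate from the pricing table)

def estimate_cost(prompt_tokens: int, completion_tokens: int, images: int = 0) -> float:
    """Return the estimated USD cost of one chat completion request."""
    cost = prompt_tokens / 1_000_000 * INPUT_PER_M
    cost += completion_tokens / 1_000_000 * OUTPUT_PER_M
    cost += images * PER_IMAGE
    return cost

# Example: a 2,000-token prompt with one image and a 500-token reply.
print(f"${estimate_cost(2_000, 500, images=1):.6f}")  # ≈ $0.000644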
Performance
Usage Statistics:
- Daily requests: 600,000+
- Daily prompt tokens: 1–1.3 billion
- Daily completion tokens: 80–300 million
- Use cases: Assistant-style interaction, visual reasoning, multilingual chat, image captioning, visual understanding
Access
Access via LangMart Chat: https://langmart.ai/chat
Model weights available on Hugging Face: meta-llama/Llama-4-Scout-17B-16E-Instruct
Providers
Active Provider: DeepInfra
- Base URL: https://api.langmart.ai/v1/openai
- Max Completion Tokens: 16,384
- Response Format: Supports JSON response format (see the request sketch after this list)
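A minimal request sketch against the OpenAI-compatible endpoint above, using the official OpenAI Python client. The model slug and the LANGMART_API_KEY environment variable are assumptions for illustration; check the LangMart dashboard for the exact identifier and credential name.

```python
# Minimal sketch of calling the model through the OpenAI-compatible endpoint.
# The model slug and LANGMART_API_KEY are assumptions, not confirmed values.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.langmart.ai/v1/openai",
    api_key=os.environ["LANGMART_API_KEY"],
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",  # assumed slug for Llama 4 Scout
    messages=[
        {"role": "user", "content": "Summarize the Llama 4 Scout architecture in two sentences."}
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```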
Parameters
Supported generation parameters (an example request using them follows the list):
- max_tokens
- temperature
- top_p
- stop
- frequency_penalty
- presence_penalty
- repetition_penalty
- top_k
- seed
- min_p
- response_format
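The sketch below shows how these parameters might be passed in practice, assuming the same OpenAI-compatible endpoint and model slug as above. Standard OpenAI parameters are sent directly, while top_k, min_p, and repetition_penalty are forwarded via extra_body, which the OpenAI Python client passes through verbatim; all values are arbitrary examples.

```python
# Sketch of a request exercising the supported generation parameters.
# The model slug, credential name, and all sampling values are illustrative.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.langmart.ai/v1/openai",
    api_key=os.environ["LANGMART_API_KEY"],  # assumed credential name
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",        # assumed slug, as above
    messages=[{"role": "user", "content": "List three uses of multimodal models as JSON."}],
    max_tokens=512,
    temperature=0.7,
    top_p=0.9,
    stop=["\n\n\n"],
    frequency_penalty=0.1,
    presence_penalty=0.1,
    seed=42,
    response_format={"type": "json_object"},
    # Provider-specific parameters go through extra_body.
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.05},
)
print(response.choices[0].message.content)
```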