Llama 4 Scout - Model Details
Overview
Meta's Llama 4 Scout is a mixture-of-experts (MoE) language model that activates 17 billion parameters per token out of 109 billion total parameters. It accepts multimodal input (text and images) and produces multilingual text and code output across 12 languages. The model was trained on approximately 40 trillion tokens, and each MoE layer contains 16 experts.
Key Details:
- Created: April 5, 2025
- License: Llama 4 Community License
- Training Data Cutoff: August 2024
- Input Modalities: Text, Image
- Output Modalities: Text
- Architecture: Mixture-of-Experts (MoE) with 16 experts (see the routing sketch after this list)
- Maximum Context Length: 10,000,000 tokens (327,680 tokens via LangMart)
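To illustrate why only 17B of the 109B parameters are active for any given token, here is a minimal, toy top-1 routing sketch. It is not Meta's implementation: the hidden sizes, the ReLU experts, and the single-active-expert assumption are simplifications chosen purely for readability.

```python
# Illustrative only: a toy top-1 MoE routing step, NOT Meta's implementation.
# Dimensions and the single-active-expert assumption are simplifications that
# show why only a fraction of the total parameters fire for each token.
import numpy as np

rng = np.random.default_rng(0)

num_experts = 16   # Scout's MoE layers have 16 experts
d_model = 64       # toy hidden size (the real model is far larger)
d_ff = 256         # toy expert feed-forward size

# Router plus 16 expert feed-forward blocks (random weights for illustration).
router_w = rng.standard_normal((d_model, num_experts))
experts = [
    (rng.standard_normal((d_model, d_ff)), rng.standard_normal((d_ff, d_model)))
    for _ in range(num_experts)
]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its single highest-scoring expert (top-1)."""
    scores = x @ router_w                    # (tokens, num_experts)
    chosen = scores.argmax(axis=-1)          # one expert index per token
    out = np.zeros_like(x)
    for e in range(num_experts):
        mask = chosen == e
        if not mask.any():
            continue                         # this expert's weights stay idle
        w_in, w_out = experts[e]
        h = np.maximum(x[mask] @ w_in, 0.0)  # toy ReLU feed-forward
        out[mask] = h @ w_out
    return out

tokens = rng.standard_normal((8, d_model))   # a batch of 8 token embeddings
print(moe_layer(tokens).shape)               # (8, 64): each token used 1 of 16 experts
```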
Pricing
Provider: DeepInfra
| Item | Cost |
|---|---|
| Input Tokens | $0.08 per 1M tokens |
| Output Tokens | $0.30 per 1M tokens |
| Image Input | $0.0003342 per image |

Quantization: FP8
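As a back-of-the-envelope check on these rates, the sketch below estimates the cost of a single request. The token and image counts are hypothetical, and the per-image figure simply reuses the rate listed above.

```python
# Rough cost estimate for one request at the DeepInfra rates listed above.
# The token and image counts below are hypothetical, purely for illustration.
INPUT_PER_M = 0.08      # USD per 1M input tokens
OUTPUT_PER_M = 0.30     # USD per 1M output tokens
PER_IMAGE = 0.0003342   # USD per image (rate from the pricing table)

def estimate_cost(prompt_tokens: int, completion_tokens: int, images: int = 0) -> float:
    """Return the estimated USD cost of one chat completion request."""
    cost = prompt_tokens / 1_000_000 * INPUT_PER_M
    cost += completion_tokens / 1_000_000 * OUTPUT_PER_M
    cost += images * PER_IMAGE
    return cost

# Example: a 2,000-token prompt with one image and a 500-token reply.
print(f"${estimate_cost(2_000, 500, images=1):.6f}")  # ≈ $0.000644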
Performance
Usage Statistics:
- Daily requests: 600,000+
- Daily prompt tokens: 1–1.3 billion
- Daily completion tokens: 80–300 million
- Use cases: Assistant-style interaction, visual reasoning, multilingual chat, image captioning, visual understanding
Access
Access via LangMart Chat: https://langmart.ai/chat
Model weights available on Hugging Face: meta-llama/Llama-4-Scout-17B-16E-Instruct
Providers
Active Provider: DeepInfra
- Base URL: https://api.langmart.ai/v1/openai
- Max Completion Tokens: 16,384
- Response Format: Supports JSON response format (see the request sketch after this list)
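A minimal request sketch against the OpenAI-compatible endpoint above, using the official OpenAI Python client. The model slug and the LANGMART_API_KEY environment variable are assumptions for illustration; check the LangMart dashboard for the exact identifier and credential name.

```python
# Minimal sketch of calling the model through the OpenAI-compatible endpoint.
# The model slug and LANGMART_API_KEY are assumptions, not confirmed values.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.langmart.ai/v1/openai",
    api_key=os.environ["LANGMART_API_KEY"],
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",  # assumed slug for Llama 4 Scout
    messages=[
        {"role": "user", "content": "Summarize the Llama 4 Scout architecture in two sentences."}
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```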
Parameters
Supported generation parameters (an example request using them follows the list):
- max_tokens
- temperature
- top_p
- stop
- frequency_penalty
- presence_penalty
- repetition_penalty
- top_k
- seed
- min_p
- response_format
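The sketch below shows how these parameters might be passed in practice, assuming the same OpenAI-compatible endpoint and model slug as above. Standard OpenAI parameters are sent directly, while top_k, min_p, and repetition_penalty are forwarded via extra_body, which the OpenAI Python client passes through verbatim; all values are arbitrary examples.

```python
# Sketch of a request exercising the supported generation parameters.
# The model slug, credential name, and all sampling values are illustrative.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.langmart.ai/v1/openai",
    api_key=os.environ["LANGMART_API_KEY"],  # assumed credential name
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",        # assumed slug, as above
    messages=[{"role": "user", "content": "List three uses of multimodal models as JSON."}],
    max_tokens=512,
    temperature=0.7,
    top_p=0.9,
    stop=["\n\n\n"],
    frequency_penalty=0.1,
    presence_penalty=0.1,
    seed=42,
    response_format={"type": "json_object"},
    # Provider-specific parameters go through extra_body.
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.05},
)
print(response.choices[0].message.content)
```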