Groq: Llama 4 Maverick 17B 128E Instruct
Model Overview
| Property | Value |
|---|---|
| Model ID | groq/meta-llama/llama-4-maverick-17b-128e-instruct |
| Name | Llama 4 Maverick 17B 128E Instruct |
| Provider | Groq (hosting) / Meta (model developer) |
| Parameters | 17B active (about 400B total across 128 experts) |
Description
Meta's Llama 4 Maverick is a mixture-of-experts (MoE) model with 17 billion active parameters per token and 128 experts, tuned for instruction following. It is hosted on Groq for low-latency, high-throughput inference.
Specifications
| Spec | Value |
|---|---|
| Context Window | 131,072 tokens |
| Max Completion | 8,192 tokens |
| Inference Speed | ~600 tokens/sec |
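
The limits in the table above can be turned into a small pre-flight check. The following is a minimal sketch, not an official client: it assumes the prompt and the completion share the 131,072-token context window (the table does not say so explicitly), and the function names `clamp_max_tokens` and `estimate_generation_seconds` are hypothetical helpers introduced only for illustration.

```python
CONTEXT_WINDOW = 131_072   # tokens, from the Specifications table
MAX_COMPLETION = 8_192     # tokens, from the Specifications table
TOKENS_PER_SECOND = 600    # approximate figure from the Specifications table


def clamp_max_tokens(prompt_tokens: int, requested: int) -> int:
    """Return the largest usable max_tokens value for a request.

    Assumption: prompt and completion tokens share the context window.
    """
    if prompt_tokens < 0 or requested < 1:
        raise ValueError("prompt_tokens must be >= 0 and requested must be >= 1")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining < 1:
        raise ValueError("prompt leaves no room for a completion")
    # The tightest of the three limits wins.
    return min(requested, MAX_COMPLETION, remaining)


def estimate_generation_seconds(completion_tokens: int) -> float:
    """Rough generation time at the quoted rate; ignores network and queueing delays."""
    return completion_tokens / TOKENS_PER_SECOND
```

With these numbers, a request for 20,000 completion tokens is capped at 8,192, and a prompt of 130,000 tokens leaves room for at most 1,072 completion tokens.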
Pricing
| Type | Price |
|---|---|
| Input | $0.05 per 1M tokens |
| Output | $0.15 per 1M tokens |
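
As a worked example of the pricing above, the cost of a request is the input token count times the input rate plus the output token count times the output rate, each divided by one million. The sketch below uses the rates from the table; `estimate_cost_usd` is a hypothetical helper name, and actual billing may round or meter differently.

```python
from decimal import Decimal

# Rates from the Pricing table, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.05")
OUTPUT_PRICE_PER_M = Decimal("0.15")
ONE_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / ONE_MILLION
```

For instance, a request with 2,000 input tokens and 500 output tokens comes to about $0.000175. `Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.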
Capabilities
- Instruction Following: Yes
- Fast Inference: Yes
- Streaming: Yes
- MoE Architecture: Yes
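
Streaming support usually means the response arrives as incremental chunks rather than one JSON body. The sketch below shows one way a client might reassemble streamed text. It assumes an OpenAI-style server-sent events format (`data: {...}` lines, a `data: [DONE]` terminator, and text under `choices[].delta.content`); this page does not confirm that wire format, so treat the field names and the helper `iter_stream_text` as illustrative only.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from an assumed OpenAI-style SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

The parser takes any iterable of decoded lines, so it can be exercised against recorded output before being wired to a live HTTP response.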
Use Cases
General-purpose instruction following, production workloads, and high-throughput applications.
Integration with LangMart
Gateway Support: Type 2 (Cloud), Type 3 (Self-hosted)
API Usage:
curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "groq/meta-llama/llama-4-maverick-17b-128e-instruct",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'
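
The same request can be issued from Python using only the standard library. This is a minimal sketch mirroring the curl call above: the URL, headers, and body fields come from that example, while the helper names `build_request` and `send` are hypothetical. The shape of the JSON response is not documented on this page, so `send` returns the parsed body without assuming any particular fields.

```python
import json
import urllib.request

API_URL = "https://api.langmart.ai/v1/chat/completions"
MODEL_ID = "groq/meta-llama/llama-4-maverick-17b-128e-instruct"


def build_request(api_key: str, prompt: str, max_tokens: int = 4096) -> urllib.request.Request:
    """Build the POST request shown in the curl example, without sending it."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `send(build_request("sk-your-api-key", "Hello"))`. Keeping request construction separate from sending makes the payload easy to inspect or test without network access.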
Related Models
- groq/meta-llama/llama-4-scout-17b-16e-instruct - Scout variant
- groq/llama-3.3-70b-versatile - Previous generation
Last Updated: December 28, 2025