Groq: DeepSeek R1 Distill Llama 70B

Model Overview

  Model ID:    groq/deepseek-r1-distill-llama-70b
  Name:        DeepSeek R1 Distill Llama 70B
  Parameters:  70B

Description

DeepSeek R1 Distill Llama 70B is a 70-billion-parameter model created by distilling DeepSeek-R1's reasoning ability into a Llama 3.3 70B base. Served on Groq's LPU inference engine, it targets distilled reasoning and multi-step problem solving at low latency.

Specifications

  Context Window:   131K tokens
  Max Completion:   8K tokens
  Inference Speed:  270 tokens/second

Pricing

  Input:   $0.59 per 1M tokens
  Output:  $0.79 per 1M tokens
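
At these rates, a request with 10,000 input tokens and 1,000 output tokens costs roughly (10,000 / 1,000,000) x $0.59 + (1,000 / 1,000,000) x $0.79, or about $0.0059 + $0.0008 = $0.0067.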

Capabilities

  • Fast inference engine (Groq LPU)
  • Cost-effective token processing
  • Reliable production performance
  • Streaming support

Limitations

  • 131K token context window
  • Maximum completion tokens: 8K
  • No image generation (inference only)

Performance

Groq's LPU inference engine delivers high token throughput (270 tokens/second for this model). Typical use cases include:

  • Real-time chat applications
  • Batch processing with predictable latency
  • High-volume inference workloads
  • Cost-sensitive deployments
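
At the listed 270 tokens/second, a 500-token completion takes roughly 1.9 seconds of generation time, excluding network and queueing overhead.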

Best Practices

  1. Token Optimization: Craft prompts to minimize token usage while maintaining quality
  2. Streaming: Use streaming responses for real-time applications (see the example after this list)
  3. Batch Processing: Leverage high TPM limits for batch inference
  4. Context Management: Utilize full context window for complex tasks
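
A minimal streaming sketch, assuming the endpoint follows the standard OpenAI streaming convention (a sequence of data: lines, each carrying an incremental delta, ending with data: [DONE]):

curl -N -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "groq/deepseek-r1-distill-llama-70b",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Explain streaming in one paragraph."}
    ]
  }'

The -N flag disables curl's output buffering so that tokens are printed as they arrive.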

Rate Limits

  • 30,000 TPM (tokens per minute)
  • Optimized for high-throughput inference
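
If the TPM cap is exceeded, an OpenAI-compatible endpoint will typically respond with HTTP 429. A rough retry-with-backoff sketch in bash (the 429 handling is an assumption about the endpoint, not documented behavior):

# Retry up to 5 times, backing off exponentially whenever the API returns 429.
for attempt in 1 2 3 4 5; do
  status=$(curl -s -o response.json -w "%{http_code}" \
    -X POST https://api.langmart.ai/v1/chat/completions \
    -H "Authorization: Bearer $GROQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model": "groq/deepseek-r1-distill-llama-70b",
         "messages": [{"role": "user", "content": "Hello!"}]}')
  [ "$status" != "429" ] && break
  sleep $((2 ** attempt))   # wait 2s, 4s, 8s, ... before retrying
done
cat response.json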

Features

  • High-speed token generation (270 tokens/sec)
  • 131K token context window
  • Suitable for: Distilled reasoning, multi-step problem solving

Integration

Use the standard OpenAI-compatible API endpoint:

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "groq/deepseek-r1-distill-llama-70b",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
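
The reply is returned in choices[0].message.content. R1-distilled models typically wrap their chain of thought in <think>...</think> tags at the start of the content, so you may want to strip that block before showing the final answer. A rough post-processing sketch (assumes jq is installed and that the think-tag convention applies to this deployment):

# Extract the assistant message text, then drop the <think>...</think> reasoning block.
curl -s -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "groq/deepseek-r1-distill-llama-70b",
    "messages": [{"role": "user", "content": "What is 17 * 23?"}]
  }' |
  jq -r '.choices[0].message.content' |
  sed '/<think>/,/<\/think>/d'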

Last updated: December 2025
Source: Groq Official Documentation