# Groq: GPT-4o Mini

## Model Overview
| Property | Value |
|---|---|
| Model ID | groq/gpt-4o-mini |
| Name | GPT-4o Mini |
| Parameters | Unknown |
## Description

GPT-4o Mini is a compact language model served through Groq's inference platform under the model ID `groq/gpt-4o-mini`. It offers strong general-purpose natural language processing capabilities at a low per-token cost.
## Specifications
| Spec | Value |
|---|---|
| Context Window | 128K tokens |
| Max Completion | 4K tokens |
| Inference Speed | 200 tokens/second |
## Pricing
| Type | Price |
|---|---|
| Input | $0.15 per 1M tokens |
| Output | $0.60 per 1M tokens |
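The pricing table translates directly into a per-request estimate. A minimal sketch, with the two rates hard-coded from the table above:

```python
# Rates taken from the pricing table: $0.15 / 1M input, $0.60 / 1M output.
INPUT_RATE_PER_M = 0.15   # USD per 1M input tokens
OUTPUT_RATE_PER_M = 0.60  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_RATE_PER_M

# Example: a 2,000-token prompt with a 500-token completion.
cost = estimate_cost(2_000, 500)  # → $0.0006
```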
## Capabilities
- Fast inference engine (Groq LPU)
- Cost-effective token processing
- Reliable production performance
- Streaming support
## Limitations
- 128K token context window
- Maximum completion tokens: 4K
- No image generation (inference only)
## Performance
Groq specializes in rapid inference with industry-leading token throughput. Typical use cases include:
- Real-time chat applications
- Batch processing with guaranteed latency
- High-volume inference workloads
- Cost-sensitive deployments
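For the batch-processing and high-volume cases above, a common client-side pattern is to fan requests out over a small worker pool. A minimal sketch; `call_model` is a placeholder, not a real client:

```python
from concurrent.futures import ThreadPoolExecutor

def call_model(prompt: str) -> str:
    # Placeholder: in practice this would POST the prompt to the
    # chat/completions endpoint and return the completion text.
    return f"response to: {prompt}"

def run_batch(prompts, max_workers=8):
    """Fan a batch of prompts out over a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call_model, prompts))

results = run_batch(["a", "b", "c"])
```

Because the work is I/O-bound (waiting on HTTP responses), threads are sufficient here; `pool.map` keeps results in the same order as the input prompts.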
## Best Practices
- Token Optimization: Craft prompts to minimize token usage while maintaining quality
- Streaming: Use streaming responses for real-time applications
- Batch Processing: Leverage high TPM limits for batch inference
- Context Management: Utilize full context window for complex tasks
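To illustrate the streaming practice: with `"stream": true`, OpenAI-compatible endpoints return server-sent-event lines whose `delta` objects carry text fragments. A sketch of extracting those fragments (the sample lines below are illustrative, not captured output):

```python
import json

def iter_stream_text(lines):
    """Yield text deltas from OpenAI-style server-sent-event lines."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives and blank separators
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # sentinel marking the end of the stream
        delta = json.loads(payload)["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Illustrative sample of a streamed response body:
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
text = "".join(iter_stream_text(sample))  # → "Hello"
```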
## Rate Limits
- 30000 TPM (Tokens Per Minute)
- Optimized for high-throughput inference
## Features
- High-speed token generation (200 tokens/sec)
- 128K token context window
- Suitable for: cost-effective, general-purpose workloads with OpenAI-compatible (ChatGPT-style) API access
## Integration
Use the standard OpenAI-compatible API endpoint:

```bash
curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "groq/gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```
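The same call can be built from the Python standard library alone. A sketch mirroring the curl example above (the request is constructed but not sent, and `GROQ_API_KEY` is read from the environment as in the curl example):

```python
import json
import os
import urllib.request

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build the same POST request as the curl example (not yet sent)."""
    body = json.dumps({
        "model": "groq/gpt-4o-mini",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        "https://api.langmart.ai/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ.get('GROQ_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually send it:
#   with urllib.request.urlopen(build_chat_request("Hello!")) as resp:
#       reply = json.load(resp)
```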
## Resources

Last updated: December 2025

Source: Groq Official Documentation