Qwen 2.5 72B Instruct
Model ID: qwen/qwen-2.5-72b-instruct
Provider: Qwen (Alibaba's Qwen model family)
Canonical Slug: qwen/qwen-2.5-72b-instruct
Overview
Qwen2.5 72B Instruct belongs to the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2:
- Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains.
- Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON. More resilient to diverse system prompts, improving role-play and condition-setting for chatbots.
- Long-context support of up to 128K tokens, with generation of up to 8K tokens.
- Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic.
Usage of this model is subject to the Tongyi Qianwen LICENSE AGREEMENT.
Specifications
| Specification | Value |
|---|---|
| Context Window | 32,768 tokens |
| Max Output Tokens | Unknown |
| Modality | text->text |
| Model Architecture | Text-to-text |
| Release Date | 2024-09-19 (Unix timestamp 1726704000) |
Pricing
| Metric | Price |
|---|---|
| Prompt Cost | $0.12 per 1M tokens ($0.00000012 per token) |
| Completion Cost | $0.39 per 1M tokens ($0.00000039 per token) |
| Currency | USD |
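At these per-token rates, per-request cost is straightforward to estimate. A minimal sketch (the token counts below are made-up examples, not measurements):

```python
# Listed rates: $0.00000012 per prompt token and $0.00000039 per
# completion token, i.e. $0.12 and $0.39 per 1M tokens.
PROMPT_RATE = 0.00000012      # USD per prompt token
COMPLETION_RATE = 0.00000039  # USD per completion token

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# e.g. a 10,000-token prompt with a 2,000-token reply
print(round(estimate_cost(10_000, 2_000), 6))  # -> 0.00198
```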
Capabilities
- Text Generation
- Instruction Following
Supported Parameters
The model supports the following parameters in API requests:
- temperature: Controls randomness (0.0 - 2.0), default: 1.0
- top_p: Nucleus sampling (0.0 - 1.0), default: 1.0
- top_k: Top-k filtering
- frequency_penalty: Reduces repetition (-2.0 to 2.0)
- presence_penalty: Encourages new topics (-2.0 to 2.0)
- repetition_penalty: Alternative repetition control (0.5 - 2.0)
- stop: Stop sequences
- seed: Random seed for reproducibility
- max_tokens: Maximum output length
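Several of these parameters can be combined in a single request body. A sketch of one such payload (the parameter values are arbitrary illustrations, not recommendations):

```python
# Illustrative request body for the /v1/chat/completions endpoint;
# parameter names match the list above, values are arbitrary examples.
payload = {
    "model": "qwen/qwen-2.5-72b-instruct",
    "messages": [{"role": "user", "content": "Summarize this table as JSON."}],
    "temperature": 0.7,        # below the default of 1.0 for steadier output
    "top_p": 0.9,              # nucleus sampling cutoff
    "frequency_penalty": 0.2,  # mildly discourage repetition
    "seed": 42,                # best-effort reproducibility
    "stop": ["\n\n"],          # halt generation at a blank line
    "max_tokens": 512,
}
```

The payload would be sent as the JSON body of a POST request, as in the curl example below.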
API Usage Example
curl -X POST https://api.langmart.ai/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen/qwen-2.5-72b-instruct",
"messages": [
{
"role": "user",
"content": "Explain quantum computing in simple terms"
}
],
"temperature": 1.0,
"max_tokens": 2048,
"top_p": 1.0
}'
Performance Metrics
Speed & Quality Tradeoff
- Inference Speed: Fast
- Quality Tier: Advanced
- Cost Efficiency: Optimized for production
Recommended Use Cases
- Long-form text generation
- Code generation and analysis
- Conversational AI
- Complex reasoning tasks
- Information synthesis
Related & Alternative Models
From Same Provider
- qwen/qwen3-max
- qwen/qwen3-coder-plus
- qwen/qwen-2.5-32b-instruct
- qwen/qwen-2.5-14b-instruct
Comparable Models from Other Providers
- OpenAI: GPT-4 Turbo, GPT-4o
- Anthropic: Claude 3.5 Sonnet
- Google: Gemini 2.0 Flash
- DeepSeek: DeepSeek-R1
Python Integration
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.langmart.ai/v1",
)

completion = client.chat.completions.create(
    model="qwen/qwen-2.5-72b-instruct",
    max_tokens=2048,
    messages=[
        {
            "role": "user",
            "content": "Your prompt here"
        }
    ],
)

print(completion.choices[0].message.content)
JavaScript/Node.js Integration
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.LANGMART_API_KEY,
baseURL: "https://api.langmart.ai/v1",
});
const completion = await openai.chat.completions.create({
model: "qwen/qwen-2.5-72b-instruct",
messages: [
{
role: "user",
content: "Your prompt here",
},
],
max_tokens: 2048,
});
console.log(completion.choices[0].message.content);
Performance Notes
Strengths
- Efficient inference with good quality
- Well-suited for production workloads
- Strong instruction-following ability
- Balanced performance and cost
Considerations
- The 32,768-token context window served here may be limiting for very long documents
- Text-only model: no vision or built-in web search
Additional Information
- Hugging Face Model: Qwen/Qwen2.5-72B-Instruct
- License: Tongyi Qianwen License Agreement
- Streaming: Supported
- Function Calling: Depends on model configuration
- Vision Capabilities: No
- Web Search: No
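Since streaming is supported, responses can be consumed incrementally. On an OpenAI-compatible endpoint, streamed chunks typically arrive as server-sent events; a minimal sketch of reassembling the text (the example chunks below are fabricated for illustration):

```python
import json

def extract_text(sse_lines):
    """Concatenate content deltas from OpenAI-style SSE chat chunks."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":  # sentinel marking the end of the stream
            break
        delta = json.loads(data)["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

# Example stream as it might appear on the wire (illustrative)
stream = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"choices":[{"delta":{"content":", world"}}]}',
    "data: [DONE]",
]
print(extract_text(stream))  # -> Hello, world
```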
Availability & Status
- LangMart Status: Available
- Rate Limits: Standard LangMart limits apply
- SLA: Subject to provider availability
Documentation Generated: 2025-12-24
Source: LangMart API & Public Documentation
Last Updated: December 2025