Qwen 2.5 72B Instruct
Model ID: qwen/qwen-2.5-72b-instruct
Provider: Qwen (Alibaba's Qwen model family)
Canonical Slug: qwen/qwen-2.5-72b-instruct
Overview
Qwen2.5 72B Instruct belongs to the latest series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2:
- Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains.
- Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON. More resilient to diverse system prompts, improving role-play and condition-setting for chatbots.
- Long-context support of up to 128K tokens, with generation of up to 8K tokens.
- Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic.
Usage of this model is subject to the Tongyi Qianwen LICENSE AGREEMENT.
Specifications
| Specification | Value |
|---|---|
| Context Window | 32,768 tokens |
| Max Output Tokens | Unknown |
| Modality | text->text |
| Model Architecture | Text-to-text |
| Release Date | 2024-09-19 (Unix timestamp 1726704000) |
Pricing
| Metric | Price |
|---|---|
| Prompt Cost | $0.12 per 1M tokens ($0.00000012 per token) |
| Completion Cost | $0.39 per 1M tokens ($0.00000039 per token) |
| Currency | USD |
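At these per-token rates, per-request cost is straightforward to estimate. A minimal sketch (the token counts below are made-up examples, not measurements):

```python
# Listed rates: $0.00000012 per prompt token and $0.00000039 per
# completion token, i.e. $0.12 and $0.39 per 1M tokens.
PROMPT_RATE = 0.00000012      # USD per prompt token
COMPLETION_RATE = 0.00000039  # USD per completion token

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# e.g. a 10,000-token prompt with a 2,000-token reply
print(round(estimate_cost(10_000, 2_000), 6))  # -> 0.00198
```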
Capabilities
- Text Generation
- Instruction Following
Supported Parameters
The model supports the following parameters in API requests:
- temperature: Controls randomness (0.0 - 2.0), default: 1.0
- top_p: Nucleus sampling (0.0 - 1.0), default: 1.0
- top_k: Top-k filtering
- frequency_penalty: Reduces repetition (-2.0 to 2.0)
- presence_penalty: Encourages new topics (-2.0 to 2.0)
- repetition_penalty: Alternative repetition control (0.5 - 2.0)
- stop: Stop sequences
- seed: Random seed for reproducibility
- max_tokens: Maximum output length
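Several of these parameters can be combined in a single request body. A sketch of one such payload (the parameter values are arbitrary illustrations, not recommendations):

```python
# Illustrative request body for the /v1/chat/completions endpoint;
# parameter names match the list above, values are arbitrary examples.
payload = {
    "model": "qwen/qwen-2.5-72b-instruct",
    "messages": [{"role": "user", "content": "Summarize this table as JSON."}],
    "temperature": 0.7,        # below the default of 1.0 for steadier output
    "top_p": 0.9,              # nucleus sampling cutoff
    "frequency_penalty": 0.2,  # mildly discourage repetition
    "seed": 42,                # best-effort reproducibility
    "stop": ["\n\n"],          # halt generation at a blank line
    "max_tokens": 512,
}
```

The payload would be sent as the JSON body of a POST request, as in the curl example below.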
API Usage Example
curl -X POST https://api.langmart.ai/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen/qwen-2.5-72b-instruct",
"messages": [
{
"role": "user",
"content": "Explain quantum computing in simple terms"
}
],
"temperature": 1.0,
"max_tokens": 2048,
"top_p": 1.0
}'
Performance Metrics
Speed & Quality Tradeoff
- Inference Speed: Fast
- Quality Tier: Advanced
- Cost Efficiency: Optimized for production
Recommended Use Cases
- Long-form text generation
- Code generation and analysis
- Conversational AI
- Complex reasoning tasks
- Information synthesis
Related & Alternative Models
From Same Provider
- qwen/qwen3-max
- qwen/qwen3-coder-plus
- qwen/qwen-2.5-32b-instruct
- qwen/qwen-2.5-14b-instruct
Comparable Models from Other Providers
- OpenAI: GPT-4 Turbo, GPT-4o
- Anthropic: Claude 3.5 Sonnet
- Google: Gemini 2.0 Flash
- DeepSeek: DeepSeek-R1
Python Integration
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.langmart.ai/v1",
)

completion = client.chat.completions.create(
    model="qwen/qwen-2.5-72b-instruct",
    max_tokens=2048,
    messages=[
        {
            "role": "user",
            "content": "Your prompt here"
        }
    ],
)

print(completion.choices[0].message.content)
JavaScript/Node.js Integration
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.LANGMART_API_KEY,
baseURL: "https://api.langmart.ai/v1",
});
const completion = await openai.chat.completions.create({
model: "qwen/qwen-2.5-72b-instruct",
messages: [
{
role: "user",
content: "Your prompt here",
},
],
max_tokens: 2048,
});
console.log(completion.choices[0].message.content);
Performance Notes
Strengths
- Efficient inference with good quality
- Well-suited for production workloads
- Strong instruction-following ability
- Balanced performance and cost
Considerations
- The 32,768-token context window served here may be limiting for very long documents
- Text-only model: no vision or built-in web search
Additional Information
- Hugging Face Model: Qwen/Qwen2.5-72B-Instruct
- License: Tongyi Qianwen License Agreement
- Streaming: Supported
- Function Calling: Depends on model configuration
- Vision Capabilities: No
- Web Search: No
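Since streaming is supported, responses can be consumed incrementally. On an OpenAI-compatible endpoint, streamed chunks typically arrive as server-sent events; a minimal sketch of reassembling the text (the example chunks below are fabricated for illustration):

```python
import json

def extract_text(sse_lines):
    """Concatenate content deltas from OpenAI-style SSE chat chunks."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":  # sentinel marking the end of the stream
            break
        delta = json.loads(data)["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

# Example stream as it might appear on the wire (illustrative)
stream = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"choices":[{"delta":{"content":", world"}}]}',
    "data: [DONE]",
]
print(extract_text(stream))  # -> Hello, world
```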
Availability & Status
- LangMart Status: Available
- Rate Limits: Standard LangMart limits apply
- SLA: Subject to provider availability
Documentation Generated: 2025-12-24
Source: LangMart API & Public Documentation
Last Updated: December 2025