Qwen2.5 Coder 32B Instruct
Overview
| Property | Value |
|----------|-------|
| Model ID | qwen/qwen-2.5-coder-32b-instruct |
| Name | Qwen2.5 Coder 32B Instruct |
| Author | Qwen |
| Release Date | November 11, 2024 |
| Context Length | 32,768 tokens |
| Modalities | Text (input/output) |
Description
Qwen2.5 Coder 32B Instruct is a code-focused large language model representing the latest iteration in the Qwen coding series. It replaces the earlier CodeQwen1.5 with significantly enhanced capabilities for:
- Code Generation: Creating high-quality code across multiple programming languages
- Code Reasoning: Understanding and explaining complex codebases
- Code Fixing: Debugging and correcting code issues
- Code Agents: Specialized support for development agent applications
The model maintains strong performance in mathematics and general tasks while prioritizing development applications.
Technical Specifications
- Architecture: Qwen2.5 series (32 billion parameters)
- Optimization: Instruction-tuned for code generation and reasoning
- Quantization Available: FP8 (via Chutes provider)
- Max Context Window: 32,768 tokens
- Max Output Tokens: 32,768 tokens
Pricing
Chutes Provider (FP8 Quantization)
| Type | Cost per Million Tokens |
|------|-------------------------|
| Input | $0.03 |
| Output | $0.11 |
Supported Parameters
| Parameter | Supported |
|-----------|-----------|
| max_tokens | Yes |
| temperature | Yes |
| top_p | Yes |
| top_k | Yes |
| stop (sequences) | Yes |
| frequency_penalty | Yes |
| presence_penalty | Yes |
| repetition_penalty | Yes |
| seed | Yes |
| response_format | Yes |
| structured_outputs | Yes |
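As a sketch of how several of these parameters appear together in a request body (field names follow the OpenAI-compatible chat-completions format used in the API Example section; the values are illustrative, not recommendations):

```python
import json

# Illustrative request body exercising several supported sampling parameters.
payload = {
    "model": "qwen/qwen-2.5-coder-32b-instruct",
    "messages": [
        {"role": "user", "content": "Refactor this loop into a list comprehension."}
    ],
    "max_tokens": 512,
    "temperature": 0.2,        # low temperature suits deterministic-leaning code edits
    "top_p": 0.9,
    "repetition_penalty": 1.05,
    "seed": 42,                # fixed seed for reproducible sampling
    "stop": ["\n\n\n"],        # optional stop sequence(s)
}

print(json.dumps(payload, indent=2))
```

The payload is then sent as the JSON body of a POST to the chat-completions endpoint, as in the curl example below.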
Use Cases
- Code Generation: Writing functions, classes, and complete applications
- Code Review: Analyzing code quality and suggesting improvements
- Debugging: Identifying and fixing bugs in existing code
- Code Explanation: Documenting and explaining complex code
- Code Translation: Converting code between programming languages
- Code Agents: Powering autonomous coding assistants
- Mathematics: Solving mathematical problems with code
- General Tasks: Handling general-purpose text generation
Qwen Coder Series
- qwen/qwen-2.5-coder-7b-instruct - Smaller 7B variant
- qwen/qwen-2.5-coder-14b-instruct - Medium 14B variant
- qwen/qwen-2.5-coder-32b-instruct - This model (largest)
Other Qwen Models
- qwen/qwen-2.5-72b-instruct - General purpose 72B model
- qwen/qwen-2.5-32b-instruct - General purpose 32B model
- qwen/qwen-2.5-14b-instruct - General purpose 14B model
Providers
| Provider | Quantization | Max Completion Tokens | Data Policy |
|----------|--------------|-----------------------|-------------|
| Chutes | FP8 | 32,768 | Training allowed; retains prompts; does not publish |
Features
| Feature | Status |
|---------|--------|
| Tool Choice | Supported (none, auto, required, function) |
| Multipart Input | Enabled |
| Chat Completions | Supported |
| Standard Completions | Supported |
| Abortable Requests | Yes |
- none: Disable tool usage
- auto: Automatic tool selection
- required: Force tool usage
- function: Force a call to a specific function
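The tool-choice options above can be illustrated with a request-body sketch. The tool definition (run_tests) is a hypothetical example; the schema shape follows the OpenAI-compatible "function" tool format:

```python
import json

# Hypothetical tool definition in the OpenAI-compatible "function" format.
tools = [
    {
        "type": "function",
        "function": {
            "name": "run_tests",  # hypothetical function name
            "description": "Run the project's test suite and return the results",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Test file or directory"}
                },
                "required": ["path"],
            },
        },
    }
]

payload = {
    "model": "qwen/qwen-2.5-coder-32b-instruct",
    "messages": [{"role": "user", "content": "Run the unit tests in tests/"}],
    "tools": tools,
    # "auto" lets the model decide; "none" disables tools; "required" forces
    # some tool call; {"type": "function", "function": {"name": "run_tests"}}
    # pins the call to one specific function.
    "tool_choice": "auto",
}

print(json.dumps(payload, indent=2))
```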
Usage Statistics
The model has seen heavy adoption, with daily request volumes exceeding 100,000 during peak periods in late November and December 2025.
API Example
curl -X POST https://api.langmart.ai/v1/chat/completions \
-H "Authorization: Bearer $LANGMART_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen/qwen-2.5-coder-32b-instruct",
"messages": [
{
"role": "user",
"content": "Write a Python function to calculate the Fibonacci sequence"
}
],
"temperature": 0.7,
"max_tokens": 1024
}'
Notes
- The model excels at code-related tasks but also performs well on general language tasks
- FP8 quantization provides a good balance between performance and resource usage
- The 32K context window allows for processing large codebases
- Supports structured outputs for reliable JSON generation
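For the structured-outputs note above, a minimal request sketch constraining the model to a JSON schema (the schema itself is a hypothetical example; the response_format shape follows the OpenAI-compatible json_schema convention):

```python
import json

# Hypothetical schema forcing the model to emit a code-review verdict as JSON.
schema = {
    "type": "object",
    "properties": {
        "verdict": {"type": "string", "enum": ["approve", "request_changes"]},
        "comments": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["verdict", "comments"],
}

payload = {
    "model": "qwen/qwen-2.5-coder-32b-instruct",
    "messages": [{"role": "user", "content": "Review this diff and return JSON."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {"name": "code_review", "strict": True, "schema": schema},
    },
}

print(json.dumps(payload, indent=2))
```

With a schema in place, the model's output can be parsed directly with json.loads instead of being scraped out of free-form text.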
Source