Qwen2.5 Coder 32B Instruct

Overview

Property Value
Model ID qwen/qwen-2.5-coder-32b-instruct
Name Qwen2.5 Coder 32B Instruct
Author Qwen
Release Date November 11, 2024
Context Length 32,768 tokens
Modalities Text (input/output)

Description

Qwen2.5 Coder 32B Instruct is a code-focused large language model representing the latest iteration in the Qwen coding series. It replaces the earlier CodeQwen1.5 with significantly enhanced capabilities for:

  • Code Generation: Creating high-quality code across multiple programming languages
  • Code Reasoning: Understanding and explaining complex codebases
  • Code Fixing: Debugging and correcting code issues
  • Code Agents: Specialized support for development agent applications

The model maintains strong performance on mathematics and general tasks while remaining optimized for software development applications.

Technical Specifications

  • Architecture: Qwen2.5 series (32 billion parameters)
  • Optimization: Instruction-tuned for code generation and reasoning
  • Quantization Available: FP8 (via Chutes provider)
  • Max Context Window: 32,768 tokens
  • Max Output Tokens: 32,768 tokens

Pricing

Chutes Provider (FP8 Quantization)

Type Cost per Million Tokens
Input $0.03
Output $0.11
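As a quick sanity check on this pricing, the cost of a single request can be estimated from its token counts. A minimal sketch, with the rates hard-coded from the table above:

```python
# Chutes pricing for qwen/qwen-2.5-coder-32b-instruct (USD per 1M tokens)
INPUT_RATE = 0.03
OUTPUT_RATE = 0.11

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the rates above."""
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion
print(f"${request_cost(2_000, 500):.6f}")  # (2000*0.03 + 500*0.11) / 1e6 = $0.000115
```

At these rates, even sustained coding-assistant workloads remain inexpensive: a million such requests would cost on the order of $115.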

Supported Parameters

Parameter Supported
max_tokens Yes
temperature Yes
top_p Yes
top_k Yes
stop (sequences) Yes
frequency_penalty Yes
presence_penalty Yes
repetition_penalty Yes
seed Yes
response_format Yes
structured_outputs Yes
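The parameters above map directly onto fields of the JSON body of a chat-completions request. A sketch of such a payload, assuming an OpenAI-compatible request shape; the field names come from the table, while all values are illustrative:

```python
import json

# Illustrative request body exercising the supported sampling parameters.
payload = {
    "model": "qwen/qwen-2.5-coder-32b-instruct",
    "messages": [{"role": "user", "content": "Refactor this function to be iterative."}],
    "max_tokens": 512,
    "temperature": 0.2,        # low temperature suits code generation
    "top_p": 0.9,
    "top_k": 40,
    "stop": ["\n\n\n"],        # stop sequences
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.05,
    "seed": 42,                # reproducible sampling where supported
}
print(json.dumps(payload, indent=2))
```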

Use Cases

  1. Code Generation: Writing functions, classes, and complete applications
  2. Code Review: Analyzing code quality and suggesting improvements
  3. Debugging: Identifying and fixing bugs in existing code
  4. Code Explanation: Documenting and explaining complex code
  5. Code Translation: Converting code between programming languages
  6. Code Agents: Powering autonomous coding assistants
  7. Mathematics: Solving mathematical problems with code
  8. General Tasks: Handling general-purpose text generation

Qwen Coder Series

  • qwen/qwen-2.5-coder-7b-instruct - Smaller 7B variant
  • qwen/qwen-2.5-coder-14b-instruct - Medium 14B variant
  • qwen/qwen-2.5-coder-32b-instruct - This model (largest)

Other Qwen Models

  • qwen/qwen-2.5-72b-instruct - General purpose 72B model
  • qwen/qwen-2.5-32b-instruct - General purpose 32B model
  • qwen/qwen-2.5-14b-instruct - General purpose 14B model

Providers

Provider  Quantization  Max Completion Tokens  Data Policy
Chutes    FP8           32,768                 Training allowed; retains prompts; does not publish

Features

Feature Status
Tool Choice Supported (none, auto, required, function)
Multipart Input Enabled
Chat Completions Supported
Standard Completions Supported
Abortable Requests Yes

Tool Choice Options

  • none: Disable tool usage
  • auto: Let the model decide whether to call a tool
  • required: Force the model to call a tool
  • function: Force a call to one specific named function
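In request terms, these options correspond to the `tool_choice` field of an OpenAI-style chat-completions body. A sketch of the four forms, assuming that request shape; the tool definition itself is hypothetical, not from the source:

```python
# Hypothetical tool definition in OpenAI-style function-calling format.
tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",  # illustrative name
        "description": "Run the project's test suite",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

# The four tool_choice forms listed above:
choice_none = "none"          # never call a tool
choice_auto = "auto"          # model decides whether to call a tool
choice_required = "required"  # model must call some tool
choice_function = {           # force one specific function
    "type": "function",
    "function": {"name": "run_tests"},
}
```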

Usage Statistics

The model has seen heavy adoption, with daily request volumes exceeding 100,000 during peak periods in late November and December 2025.

API Example

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen-2.5-coder-32b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "Write a Python function to calculate the Fibonacci sequence"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

Notes

  • The model excels at code-related tasks but also performs well on general language tasks
  • FP8 quantization provides a good balance between performance and resource usage
  • The 32K context window allows for processing large codebases
  • Supports structured outputs for reliable JSON generation
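For the structured-outputs note above, an OpenAI-style `response_format` with a JSON schema is the usual way to request reliable JSON, assuming the endpoint follows that convention. A sketch; the schema name and fields are illustrative:

```python
import json

# Illustrative response_format asking for a strictly validated JSON object.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "bug_report",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "file": {"type": "string"},
                "line": {"type": "integer"},
                "summary": {"type": "string"},
            },
            "required": ["file", "line", "summary"],
            "additionalProperties": False,
        },
    },
}
# This dict would be passed as the "response_format" field of the request body.
print(json.dumps(response_format)[:40])
```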
