Qwen2.5 Coder 32B Instruct

Overview

Property Value
Model ID qwen/qwen-2.5-coder-32b-instruct
Name Qwen2.5 Coder 32B Instruct
Author Qwen
Release Date November 11, 2024
Context Length 32,768 tokens
Modalities Text (input/output)

Description

Qwen2.5 Coder 32B Instruct is a code-focused large language model representing the latest iteration in the Qwen coding series. It replaces the earlier CodeQwen1.5 with significantly enhanced capabilities for:

  • Code Generation: Creating high-quality code across multiple programming languages
  • Code Reasoning: Understanding and explaining complex codebases
  • Code Fixing: Debugging and correcting code issues
  • Code Agents: Specialized support for development agent applications

The model maintains strong performance on mathematics and general tasks while remaining optimized for software development applications.

Technical Specifications

  • Architecture: Qwen2.5 series (32 billion parameters)
  • Optimization: Instruction-tuned for code generation and reasoning
  • Quantization Available: FP8 (via Chutes provider)
  • Max Context Window: 32,768 tokens
  • Max Output Tokens: 32,768 tokens

Pricing

Chutes Provider (FP8 Quantization)

Type Cost per Million Tokens
Input $0.03
Output $0.11
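As a quick sanity check on this pricing, the cost of a single request can be estimated from its token counts. A minimal sketch, with the rates hard-coded from the table above:

```python
# Chutes pricing for qwen/qwen-2.5-coder-32b-instruct (USD per 1M tokens)
INPUT_RATE = 0.03
OUTPUT_RATE = 0.11

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the rates above."""
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion
print(f"${request_cost(2_000, 500):.6f}")  # (2000*0.03 + 500*0.11) / 1e6 = $0.000115
```

At these rates, even sustained coding-assistant workloads remain inexpensive: a million such requests would cost on the order of $115.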

Supported Parameters

Parameter Supported
max_tokens Yes
temperature Yes
top_p Yes
top_k Yes
stop (sequences) Yes
frequency_penalty Yes
presence_penalty Yes
repetition_penalty Yes
seed Yes
response_format Yes
structured_outputs Yes
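The parameters above map directly onto fields of the JSON body of a chat-completions request. A sketch of such a payload, assuming an OpenAI-compatible request shape; the field names come from the table, while all values are illustrative:

```python
import json

# Illustrative request body exercising the supported sampling parameters.
payload = {
    "model": "qwen/qwen-2.5-coder-32b-instruct",
    "messages": [{"role": "user", "content": "Refactor this function to be iterative."}],
    "max_tokens": 512,
    "temperature": 0.2,        # low temperature suits code generation
    "top_p": 0.9,
    "top_k": 40,
    "stop": ["\n\n\n"],        # stop sequences
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.05,
    "seed": 42,                # reproducible sampling where supported
}
print(json.dumps(payload, indent=2))
```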

Use Cases

  1. Code Generation: Writing functions, classes, and complete applications
  2. Code Review: Analyzing code quality and suggesting improvements
  3. Debugging: Identifying and fixing bugs in existing code
  4. Code Explanation: Documenting and explaining complex code
  5. Code Translation: Converting code between programming languages
  6. Code Agents: Powering autonomous coding assistants
  7. Mathematics: Solving mathematical problems with code
  8. General Tasks: Handling general-purpose text generation

Qwen Coder Series

  • qwen/qwen-2.5-coder-7b-instruct - Smaller 7B variant
  • qwen/qwen-2.5-coder-14b-instruct - Medium 14B variant
  • qwen/qwen-2.5-coder-32b-instruct - This model (largest)

Other Qwen Models

  • qwen/qwen-2.5-72b-instruct - General purpose 72B model
  • qwen/qwen-2.5-32b-instruct - General purpose 32B model
  • qwen/qwen-2.5-14b-instruct - General purpose 14B model

Providers

Provider  Quantization  Max Completion Tokens  Data Policy
Chutes    FP8           32,768                 Training allowed; retains prompts; does not publish

Features

Feature Status
Tool Choice Supported (none, auto, required, function)
Multipart Input Enabled
Chat Completions Supported
Standard Completions Supported
Abortable Requests Yes

Tool Choice Options

  • none: Disable tool usage
  • auto: Let the model decide whether to call a tool
  • required: Force the model to call a tool
  • function: Force a call to one specific named function
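In request terms, these options correspond to the `tool_choice` field of an OpenAI-style chat-completions body. A sketch of the four forms, assuming that request shape; the tool definition itself is hypothetical, not from the source:

```python
# Hypothetical tool definition in OpenAI-style function-calling format.
tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",  # illustrative name
        "description": "Run the project's test suite",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

# The four tool_choice forms listed above:
choice_none = "none"          # never call a tool
choice_auto = "auto"          # model decides whether to call a tool
choice_required = "required"  # model must call some tool
choice_function = {           # force one specific function
    "type": "function",
    "function": {"name": "run_tests"},
}
```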

Usage Statistics

The model has seen heavy adoption, with daily request volumes exceeding 100,000 during peak periods in late November and December 2025.

API Example

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen-2.5-coder-32b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "Write a Python function to calculate the Fibonacci sequence"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

Notes

  • The model excels at code-related tasks but also performs well on general language tasks
  • FP8 quantization provides a good balance between performance and resource usage
  • The 32K context window allows for processing large codebases
  • Supports structured outputs for reliable JSON generation
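For the structured-outputs note above, an OpenAI-style `response_format` with a JSON schema is the usual way to request reliable JSON, assuming the endpoint follows that convention. A sketch; the schema name and fields are illustrative:

```python
import json

# Illustrative response_format asking for a strictly validated JSON object.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "bug_report",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "file": {"type": "string"},
                "line": {"type": "integer"},
                "summary": {"type": "string"},
            },
            "required": ["file", "line", "summary"],
            "additionalProperties": False,
        },
    },
}
# This dict would be passed as the "response_format" field of the request body.
print(json.dumps(response_format)[:40])
```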
