LangMart: Qwen: Qwen3 VL 32B Instruct
Model Overview
| Property | Value |
|---|---|
| Model ID | openrouter/qwen/qwen3-vl-32b-instruct |
| Name | Qwen: Qwen3 VL 32B Instruct |
| Provider | qwen |
| Released | 2025-10-23 |
Description
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text comprehension, enabling fine-grained spatial reasoning, document and scene analysis, and long-horizon video understanding.Robust OCR in 32 languages, and enhanced multimodal fusion through Interleaved-MRoPE and DeepStack architectures. Optimized for agentic interaction and visual tool use, Qwen3-VL-32B delivers state-of-the-art performance for complex real-world multimodal tasks.
Description
LangMart: Qwen: Qwen3 VL 32B Instruct is a language model provided by qwen. This model offers advanced capabilities for natural language processing tasks.
Provider
qwen
Specifications
| Spec | Value |
|---|---|
| Context Window | 262,144 tokens |
| Modalities | text+image->text |
| Input Modalities | text, image |
| Output Modalities | text |
Pricing
| Type | Price |
|---|---|
| Input | $0.50 per 1M tokens |
| Output | $1.50 per 1M tokens |
Capabilities
- Frequency penalty
- Logit bias
- Max tokens
- Min p
- Presence penalty
- Repetition penalty
- Response format
- Stop
- Structured outputs
- Temperature
- Top k
- Top p