O

LangMart: Mistral: Pixtral Large 2411

Openrouter
Vision
131K
Context
$2.00
Input /1M
$6.00
Output /1M
N/A
Max Output

LangMart: Mistral: Pixtral Large 2411

Model Overview

Property Value
Model ID openrouter/mistralai/pixtral-large-2411
Name Mistral: Pixtral Large 2411
Provider mistralai
Released 2024-11-19

Description

Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of Mistral Large 2. The model is able to understand documents, charts and natural images.

The model is available under the Mistral Research License (MRL) for research and educational use, and the Mistral Commercial License for experimentation, testing, and production for commercial purposes.

Description

LangMart: Mistral: Pixtral Large 2411 is a language model provided by mistralai. This model offers advanced capabilities for natural language processing tasks.

Provider

mistralai

Specifications

Spec Value
Context Window 131,072 tokens
Modalities text+image->text
Input Modalities text, image
Output Modalities text

Pricing

Type Price
Input $2.00 per 1M tokens
Output $6.00 per 1M tokens

Capabilities

  • Frequency penalty
  • Max tokens
  • Presence penalty
  • Response format
  • Seed
  • Stop
  • Structured outputs
  • Temperature
  • Tool choice
  • Tools
  • Top p

Detailed Analysis

Pixtral Large 2411 (November 2024) is a 124B multimodal powerhouse built on Mistral Large 2 (2407), representing frontier-level vision-language capabilities. This model combines Mistral Large's exceptional reasoning and language understanding with advanced vision processing, achieving state-of-the-art results across multimodal benchmarks. On MathVista (mathematical reasoning over visual data), Pixtral Large achieves 69.4%, outperforming all competitors including GPT-4V. The model excels at document analysis (complex financial reports, legal documents with diagrams), chart and graph interpretation requiring deep reasoning, natural image understanding with nuanced context, and multi-image reasoning across 30+ high-resolution images within its 128K context window. Pixtral Large represents the pinnacle of open-weights multimodal AI, enabling sophisticated vision-language applications previously requiring proprietary models. Ideal for enterprise document intelligence, advanced data visualization analysis, scientific figure interpretation, accessibility solutions requiring detailed image understanding, and research into multimodal architectures. The 124B scale enables reasoning depth matching text-only frontier models while processing visual information.