
LangMart: Mistral: Mistral Small 3

Provider: OpenRouter
Context: 33K
Input: $0.03 /1M tokens
Output: $0.11 /1M tokens
Max Output: N/A

LangMart: Mistral: Mistral Small 3

Model Overview

Property Value
Model ID openrouter/mistralai/mistral-small-24b-instruct-2501
Name Mistral: Mistral Small 3
Provider mistralai
Released 2025-01-30

Description

Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it features both pre-trained and instruction-tuned versions designed for efficient local deployment.

The model achieves 81% accuracy on the MMLU benchmark and performs competitively with larger models like Llama 3.3 70B and Qwen 32B, while operating at three times the speed on equivalent hardware.

Provider

mistralai

Specifications

Spec Value
Context Window 32,768 tokens
Modalities text->text
Input Modalities text
Output Modalities text

Pricing

Type Price
Input $0.03 per 1M tokens
Output $0.11 per 1M tokens
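
At these rates, a request consuming 100,000 input tokens and producing 20,000 output tokens would cost roughly 0.1 × $0.03 + 0.02 × $0.11 = $0.003 + $0.0022 ≈ $0.0052.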

Capabilities

  • Frequency penalty
  • Logit bias
  • Max tokens
  • Min p
  • Presence penalty
  • Repetition penalty
  • Response format
  • Seed
  • Stop
  • Structured outputs
  • Temperature
  • Tool choice
  • Tools
  • Top k
  • Top p
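
These are the request parameters the model accepts. Below is a minimal sketch of passing several of them through an OpenAI-compatible chat-completions client; the base URL, credential variable, and exact model identifier are assumptions about how the OpenRouter route is configured, not values taken from this page.

```python
# Minimal sketch: calling the model with several of the listed parameters
# through an OpenAI-compatible client. base_url, the API key variable, and
# the model ID are assumptions, not confirmed values from this page.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # assumed OpenAI-compatible endpoint
    api_key=os.environ["OPENROUTER_API_KEY"],  # assumed credential variable
)

response = client.chat.completions.create(
    model="mistralai/mistral-small-24b-instruct-2501",  # assumed routing ID
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the Apache 2.0 license in two sentences."},
    ],
    temperature=0.3,        # Temperature
    top_p=0.9,              # Top p
    max_tokens=256,         # Max tokens
    frequency_penalty=0.1,  # Frequency penalty
    presence_penalty=0.0,   # Presence penalty
    seed=42,                # Seed (best-effort reproducibility)
    stop=["\n\n"],          # Stop sequences
)
print(response.choices[0].message.content)
```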

Detailed Analysis

Mistral Small 24B Instruct 2501 (January 2025) is a 24-billion-parameter model that delivers performance comparable to models roughly three times its size through architectural and training improvements. It scores 81% on the MMLU benchmark, competing with Llama 3.3 70B and Qwen 32B while running at about three times their speed on equivalent hardware, which translates into high-quality output at substantially lower computational cost.

The model is well suited to fast-response conversational agents that need intelligence without added latency, low-latency function calling for real-time tool integration (see the sketch below), cost-optimized text generation that still maintains quality, and other applications where model efficiency directly affects the user experience.

At 24B parameters it sits near an efficiency sweet spot: significantly more capable than 7B models while remaining deployable on a single high-end GPU. The January 2025 release incorporates recent training techniques, so the model reflects up-to-date knowledge, modern language patterns, and current events.
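
As a rough illustration of the tool-integration workflow mentioned above, the sketch below sends an OpenAI-style tools definition and inspects the model's tool call. The client setup repeats the same assumptions as the earlier example, and the get_weather tool is purely hypothetical.

```python
# Sketch of a function-calling round trip using the "Tools" / "Tool choice"
# capabilities listed above. Endpoint, key variable, and model ID are the same
# assumptions as in the previous example; get_weather is a hypothetical tool.
import json
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # assumed endpoint
    api_key=os.environ["OPENROUTER_API_KEY"],  # assumed credential variable
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="mistralai/mistral-small-24b-instruct-2501",  # assumed routing ID
    messages=[{"role": "user", "content": "What's the weather in Paris right now?"}],
    tools=tools,
    tool_choice="auto",  # let the model decide whether to call the tool
)

message = response.choices[0].message
if message.tool_calls:
    # The model returns the function name and JSON-encoded arguments; the
    # application runs the function and replies with a "tool" role message.
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    print(message.content)
```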