
Mistral AI: Mistral Small

Mistral AI · 32K context · $0.20 input /1M tokens · $0.60 output /1M tokens · Max output: N/A

Description

Mistral Small is a 22-billion parameter model serving as a convenient mid-point between smaller and larger Mistral options. It emphasizes reasoning capabilities, code generation, and multilingual support for English, French, German, Italian, and Spanish.

Pricing

Type Price
Input $0.20 per 1M tokens
Output $0.60 per 1M tokens

Cost Profile: A cost-effective alternative to larger Mistral models, offering strong value for code generation and multilingual tasks.
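At these rates, per-request cost is straightforward to estimate. A minimal sketch (the rate constants come from the pricing table above; the helper name is made up):

```python
# Rates from the pricing table above, in USD per 1M tokens.
INPUT_PER_M = 0.20
OUTPUT_PER_M = 0.60

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost for one request at Mistral Small's listed rates."""
    return (prompt_tokens * INPUT_PER_M + completion_tokens * OUTPUT_PER_M) / 1_000_000

# A 2,000-token prompt with a 500-token completion costs roughly $0.0007.
cost = estimate_cost(2_000, 500)
```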

Capabilities

  • Text-to-text inference
  • Code production and reasoning
  • Multilingual text processing
  • Cost-effective deployment
  • Function calling support
  • Structured output generation
  • JSON mode support
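JSON mode and structured output are requested through fields of the chat completions body. A hedged sketch of such a request body, assuming LangMart follows the OpenAI-compatible schema used in the API example below; the `response_format` field and the prompt are illustrative:

```python
import json

# Illustrative JSON-mode request body (assumes an OpenAI-compatible schema).
payload = {
    "model": "mistralai/mistral-small",
    "messages": [
        {"role": "system", "content": "Reply only with a JSON object."},
        {"role": "user", "content": "Extract city and country from: 'I live in Lyon, France.'"},
    ],
    # JSON mode: constrains the model to emit syntactically valid JSON.
    "response_format": {"type": "json_object"},
    "temperature": 0.3,
}

body = json.dumps(payload)  # ready to send as the POST body
```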

Use Cases

  1. Code Development - Strong coding capabilities
  2. Reasoning Tasks - Good for complex problem-solving
  3. Multilingual Applications - Support for 5 languages
  4. API Integration - Reliable function calling
  5. Cost-Sensitive Deployments - Efficient 22B size
  6. Production Services - Stable and reliable

Integration with LangMart

Gateway Support: Type 2 (Cloud), Type 3 (Self-hosted)

Recommended Setup:

./core.sh start 2  # Cloud gateway
./core.sh start 3  # Self-hosted gateway

API Usage:

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/mistral-small",
    "messages": [{"role": "user", "content": "Write a Python function"}],
    "temperature": 0.3
  }'
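The same request can be issued from Python without a third-party SDK. A sketch using only the standard library; the key and URL are the placeholders from the curl example above:

```python
import json
import urllib.request

API_KEY = "sk-your-api-key"  # placeholder, as in the curl example
URL = "https://api.langmart.ai/v1/chat/completions"

payload = {
    "model": "mistralai/mistral-small",
    "messages": [{"role": "user", "content": "Write a Python function"}],
    "temperature": 0.3,
}

# Build the POST request with the same headers as the curl example.
req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# To actually send (requires network access and a valid key):
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```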

Related Models from Mistral AI:

  • Mistral Medium - Larger variant with more capabilities
  • Mistral Large - Full-featured large model
  • Mistral Next (Alias) - Latest optimized version

Model Information

Model ID (API): mistralai/mistral-small

Provider: Mistral AI

Release Date: January 10, 2024

Latest Update: November 10, 2025

Model Architecture: Transformer-based dense architecture

Parameters: 22 billion

Context Window: 32,000 tokens

Input/Output Specifications

Input Modalities: Text

Output Modalities: Text

Default Temperature: 0.3

Max Context: 32,000 tokens
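Before sending long prompts, it can help to pre-check them against the 32,000-token window. A rough sketch using the common 4-characters-per-token heuristic (an approximation only; Mistral's actual tokenizer will count differently):

```python
MAX_CONTEXT = 32_000  # Mistral Small's context window, in tokens

def fits_context(prompt: str, max_output_tokens: int = 1_000) -> bool:
    """Rough pre-flight check: does prompt + reserved output fit the window?

    Uses a ~4 characters-per-token estimate, not a real tokenizer.
    """
    est_prompt_tokens = len(prompt) // 4 + 1
    return est_prompt_tokens + max_output_tokens <= MAX_CONTEXT

fits_context("hello " * 10)    # short prompt: fits
fits_context("x" * 200_000)    # ~50K estimated tokens: does not fit
```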

Performance Metrics

Recent Activity (December 4, 2025):

  • Requests Processed: 28,058
  • Prompt Tokens: 38.05 million
  • Completion Tokens: 1.25 million
  • Tool Calls: 3,456

Daily Usage Range: 8,000 - 175,000+ requests

Trending: Consistent daily usage with strong adoption

Model Capabilities & Features

Supported Parameters

  • Temperature control
  • Top-p sampling
  • Stop sequences
  • Max tokens
  • Frequency/presence penalties
  • Tool calling parameters
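The parameters listed above map directly onto fields of the chat completions body. An illustrative payload exercising them, including a tool definition for function calling (the schema assumes the OpenAI-compatible format; the `get_weather` tool is invented for the example):

```python
import json

# Illustrative request body exercising the supported parameters above.
payload = {
    "model": "mistralai/mistral-small",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "temperature": 0.3,          # temperature control
    "top_p": 0.9,                # top-p sampling
    "max_tokens": 256,           # max tokens
    "stop": ["\n\n"],            # stop sequences
    "frequency_penalty": 0.0,    # frequency/presence penalties
    "presence_penalty": 0.0,
    "tools": [{                  # tool calling parameters
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": "auto",
}

body = json.dumps(payload)
```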

Strengths

  • Strong reasoning abilities
  • Excellent code generation
  • Multilingual support (5 languages)
  • Cost-effective pricing
  • Good instruction following
  • Reliable function calling
  • Fast inference speed (22B size)

Languages Supported

  • English
  • French
  • German
  • Italian
  • Spanish

Performance Characteristics

  • Inference Speed: Fast (22B parameter model)
  • Reasoning: Enhanced capabilities for complex tasks
  • Code Quality: High-quality code generation
  • Multilingual: Strong across 5 major languages
  • Tool Usage: Reliable function calling

Performance Recommendations

Best For:

  • Teams needing cost-effective solutions
  • Multilingual applications
  • Code-heavy workloads
  • High-throughput systems
  • Resource-constrained deployments

Trade-offs:

  • Smaller than Mistral Large
  • Less capable than enterprise models
  • Limited context vs. newer models

Deployment Notes

  • Excellent for production deployment
  • Suitable for scaling (high throughput)
  • Good for cost-optimization initiatives
  • Mid-range capability sweet spot
  • Strong multilingual support

Testing Results

Inference Speed: Excellent (22B parameter model)

Reasoning Quality: Good (mid-tier Mistral)

Code Generation: Strong

Function Calling: Reliable

Last Updated: December 24, 2025