
Mistral AI Codestral

Model Overview

| Property | Value |
|---|---|
| Model Name | Codestral |
| Provider | Mistral AI |
| Model ID | mistralai/codestral-latest |
| Current Version | mistralai/codestral-2508 (v25.08) |
| Model Type | Premier (Proprietary) |
| Category | Code Generation / Specialist |
| Parameter Count | 22 billion |
| Context Window | 256,000 tokens |
| Architecture | Transformer-based |
| License | MNPL-0.1 (Mistral Non-Production License) |
| Initial Release | May 29, 2024 |
| Latest Version Release | August 1, 2025 |

Description

Codestral is Mistral AI's language model designed explicitly for code generation. As Mistral's first code-specific generative model, it was released with open weights (under a non-production license) and is optimized for programming and software-development workflows.

The model specializes in low-latency, high-frequency tasks such as:

  • Fill-in-the-Middle (FIM): Predict and generate the code between a given prefix and suffix
  • Code Correction: Identify and fix errors in existing code
  • Test Generation: Automatically generate unit tests and test cases
  • Code Completion: Intelligent autocomplete for various programming languages
  • Documentation Generation: Write code documentation and explanations
  • Code Refactoring: Suggest improvements and restructure code
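Conceptually, a FIM request brackets the cursor position: the model receives the surrounding prefix and suffix and generates only the missing middle. A minimal sketch of that layout (the control-token strings below are placeholders for illustration, not Codestral's actual special tokens; real requests should go through Mistral's official tokenizer):

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    # Illustrative layout only: suffix-first ordering with placeholder
    # markers; Mistral's tokenizer builds the real token sequence.
    return f"[SUFFIX]{suffix}[PREFIX]{prefix}"

prompt = build_fim_prompt("def add(a, b):\n", "    return result\n")
```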

Technical Specifications

Model Versions

| Version | API Name | Status | Release Date | Notes |
|---|---|---|---|---|
| v25.08 | codestral-2508 | Active | August 2025 | Current production version |
| v25.01 | codestral-2501 | Deprecated | January 2025 | Retired Nov 30, 2025 |
| v24.05 | codestral-2405 | Retired | May 2024 | Original release |

Supported Parameters

| Parameter | Type | Description |
|---|---|---|
| temperature | float | Controls randomness (0.0-1.0) |
| max_tokens | integer | Maximum tokens to generate |
| min_tokens | integer | Minimum tokens to enforce |
| top_p | float | Nucleus sampling parameter |
| stop | string/array | Stop sequences |
| frequency_penalty | float | Penalty for token frequency |
| presence_penalty | float | Penalty for token presence |
| seed | integer | Random seed for reproducibility |
| tools | array | Function/tool definitions |
| tool_choice | string | Tool selection mode (auto, required, none) |
| structured_outputs | boolean | Enable structured JSON output |
| response_format | object | Output format specification |
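Several of these parameters are typically combined in a single request. A sketch of an illustrative request payload (the model ID comes from this page; the parameter values are arbitrary examples, not recommendations):

```python
# Illustrative chat-completions payload combining several of the
# parameters listed above; values are example choices only.
payload = {
    "model": "mistralai/codestral-2508",
    "messages": [
        {"role": "user", "content": "Write a binary search function in Python"}
    ],
    "temperature": 0.2,   # low randomness suits code generation
    "max_tokens": 512,
    "top_p": 0.95,
    "seed": 42,           # fixed seed for reproducible sampling
    "stop": ["\n\n\n"],   # stop on a long blank run
}
```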

Modalities

  • Input: Text only
  • Output: Text only

Pricing

Current Pricing (Codestral 2508)

| Type | Cost per Million Tokens | Cost per 1K Tokens |
|---|---|---|
| Input | $0.30 | $0.0003 |
| Output | $0.90 | $0.0009 |

Example Cost Calculation

  • 1,000 input tokens + 500 output tokens = ~$0.00075
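That figure follows directly from the per-token rates; a minimal helper with the Codestral 2508 rates from the table above hard-coded:

```python
INPUT_USD_PER_M = 0.30   # Codestral 2508 input rate, USD per 1M tokens
OUTPUT_USD_PER_M = 0.90  # Codestral 2508 output rate, USD per 1M tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD at the rates above."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000

cost = estimate_cost(1_000, 500)   # ≈ $0.00075
```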

Pricing Context

Codestral occupies a mid-range pricing tier among coding models:

  • More expensive than entry-level models like Gemma-3-4B-IT ($0.017/M input)
  • Less costly than premium offerings like Claude Opus 4 ($15.00/M input)
  • Comparable to Mistral's Devstral Medium ($0.30/M input, $0.90/M output)

Limitations

  1. No Built-in Moderation: The model does not have built-in safety guardrails
  2. Code-Focused: Not optimized for general conversational tasks
  3. Commercial License Required: MNPL-0.1 requires separate commercial license for production use
  4. No Image/Audio: Text-only input and output

Codestral Variants

| Model | Parameters | Context | Use Case |
|---|---|---|---|
| Codestral Mamba | 7.3B | 256K | Lightweight; theoretically unbounded sequence length |
| Codestral Embed | - | 8K | Code embeddings for RAG |

Alternative Coding Models

| Model | Provider | Strengths |
|---|---|---|
| DeepSeek Coder 33B | DeepSeek | Strong MBPP performance |
| CodeLlama 70B | Meta | Open-weight alternative |
| Devstral 2 | Mistral AI | Agentic coding (123B) |
| Devstral Medium | Mistral AI | Balance of speed and capability |

Providers

Model Availability

| Model ID | Context | Status |
|---|---|---|
| mistralai/codestral-2508 | 256K | Available |
| mistralai/codestral-2501 | 256K | Deprecated |
| mistralai/codestral-latest | 256K | Redirects to latest version |

Direct API Access

  1. codestral.mistral.ai - Monthly subscription (currently free), requires phone verification
  2. api.mistral.ai - Pay-per-use with existing API keys, better for business applications
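When calling Mistral directly, FIM completions use a dedicated endpoint rather than the chat route. A hedged sketch of the request body (the /v1/fim/completions path and field names follow Mistral's public API reference as of this writing; verify against the current docs before relying on them):

```python
import json

# Assumed endpoint and schema for Mistral's native FIM API; paths
# and fields may change, so check the current API reference.
FIM_URL = "https://api.mistral.ai/v1/fim/completions"

body = {
    "model": "codestral-latest",
    "prompt": "def fibonacci(n):\n",  # code before the cursor
    "suffix": "    return fib\n",     # code after the cursor
    "max_tokens": 128,
    "temperature": 0.0,               # deterministic completion
}
encoded = json.dumps(body)            # request body, ready to POST
```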

Language Support

Codestral demonstrates proficiency across 80+ programming languages, including:

  • Python
  • Java
  • JavaScript/TypeScript
  • C/C++
  • C#
  • Go
  • Rust
  • PHP
  • Ruby

Additional Languages

  • Swift
  • Kotlin
  • Bash/Shell
  • SQL
  • Fortran
  • Scala
  • R
  • Lua
  • Perl
  • And 60+ more

Performance Benchmarks

Code Generation Benchmarks

| Benchmark | Codestral 22B | CodeLlama 70B | DeepSeek Coder 33B | Llama 3 70B |
|---|---|---|---|---|
| HumanEval (Python) | 81.1% | 67.1% | 77.4% | 76.2% |
| MBPP | 78.2% | 70.8% | 80.2% | 76.7% |
| CruxEval-O | 51.3% | 47.3% | 49.5% | 26.0% |
| RepoBench EM | 34.0% | 11.4% | 28.4% | 18.4% |
| Spider (SQL) | 63.5% | 37.0% | 60.0% | - |

Key Performance Highlights

  1. Python Code Generation: 81.1% pass rate on HumanEval, outperforming CodeLlama 70B by 14 percentage points
  2. Repository-Level Tasks: 34% exact match on RepoBench, roughly 3x CodeLlama 70B, aided by a longer context window (32K at the time of benchmarking)
  3. SQL Generation: 63.5% on the Spider benchmark, well ahead of the compared models
  4. Multi-Language: Strong HumanEval performance averaged across the six additional languages tested
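The relative gains quoted in the highlights follow directly from the benchmark table; a quick arithmetic check:

```python
# Scores copied from the benchmark table above (percent).
humaneval = {"Codestral 22B": 81.1, "CodeLlama 70B": 67.1}
repobench = {"Codestral 22B": 34.0, "CodeLlama 70B": 11.4}

gap_points = humaneval["Codestral 22B"] - humaneval["CodeLlama 70B"]  # ~14 points
repo_ratio = repobench["Codestral 22B"] / repobench["CodeLlama 70B"]  # ~3.0x
```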

API Usage Examples

Chat Completion (Python)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.langmart.ai/v1",
    api_key="YOUR_LANGMART_API_KEY"
)

response = client.chat.completions.create(
    model="mistralai/codestral-2508",
    messages=[
        {"role": "user", "content": "Write a Python function to calculate Fibonacci numbers"}
    ],
    temperature=0.3,   # low temperature keeps code output focused
    max_tokens=1000
)

print(response.choices[0].message.content)

Fill-in-the-Middle (FIM)

from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.instruct.request import FIMRequest

tokenizer = MistralTokenizer.v3()
model = Transformer.from_folder("~/codestral-22B-240529")

prefix = """def add("""
suffix = """    return sum"""

request = FIMRequest(prompt=prefix, suffix=suffix)
tokens = tokenizer.encode_fim(request).tokens
out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.0)
result = tokenizer.decode(out_tokens[0])
print(result)  # the generated middle section

cURL Example

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/codestral-2508",
    "messages": [
      {"role": "user", "content": "Explain this code: def factorial(n): return 1 if n <= 1 else n * factorial(n-1)"}
    ],
    "temperature": 0.3
  }'

Integrations

IDE Plugins

  • Continue.dev - VS Code and JetBrains integration
  • Tabnine - AI coding assistant integration
  • Cursor - AI-powered code editor

Frameworks

  • LangChain - Python/JS framework integration
  • LlamaIndex - RAG and data framework support
  • Jupyter AI - Jupyter notebook integration

Development Tools

  • E2B - Secure code execution sandboxes
  • Tabby - Self-hosted AI coding assistant

Usage Statistics

Based on recent data (December 2025):

  • Daily Requests: ~18,000+ requests/day
  • Prompt Tokens: ~47B tokens processed daily
  • Completion Tokens: ~4B tokens generated daily

Version History

| Date | Version | Changes |
|---|---|---|
| Aug 2025 | v25.08 | Latest version, improved architecture |
| Jan 2025 | v25.01 | 2x faster generation, 256K context |
| May 2024 | v24.05 | Initial release, 32K context |
