Mistral AI Codestral
Model Overview
| Property | Value |
|---|---|
| Model Name | Codestral |
| Provider | Mistral AI |
| Model ID | mistralai/codestral-latest |
| Current Version | mistralai/codestral-2508 (v25.08) |
| Model Type | Premier (Proprietary) |
| Category | Code Generation / Specialist |
| Parameter Count | 22 billion |
| Context Window | 256,000 tokens |
| Architecture | Transformer-based |
| License | MNPL-0.1 (Mistral Non-Production License) |
| Initial Release | May 29, 2024 |
| Latest Version Release | August 1, 2025 |
Description
Codestral is Mistral AI's language model purpose-built for code generation. It was Mistral's first code-specific generative model, released with open weights and optimized for programming and software development workflows.
The model specializes in low-latency, high-frequency tasks such as:
- Fill-in-the-Middle (FIM): Predict and generate code between prefix and suffix tokens
- Code Correction: Identify and fix errors in existing code
- Test Generation: Automatically generate unit tests and test cases
- Code Completion: Intelligent autocomplete for various programming languages
- Documentation Generation: Write code documentation and explanations
- Code Refactoring: Suggest improvements and restructure code
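To illustrate the FIM task above, the sketch below assembles a raw fill-in-the-middle prompt by hand. The `[SUFFIX]`/`[PREFIX]` token names follow Mistral's published FIM format, but treat the exact markup as an assumption; in practice the official tokenizer handles this encoding.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    # Codestral-style FIM: the suffix comes first, then the prefix;
    # the model generates the missing middle between them.
    return f"[SUFFIX]{suffix}[PREFIX]{prefix}"

prompt = build_fim_prompt(
    prefix="def add(a, b):\n    ",
    suffix="\n    return result",
)
```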
Technical Specifications
Model Versions
| Version | API Name | Status | Release Date | Notes |
|---|---|---|---|---|
| v25.08 | codestral-2508 | Active | August 2025 | Current production version |
| v25.01 | codestral-2501 | Deprecated | January 2025 | Retired Nov 30, 2025 |
| v24.05 | codestral-2405 | Retired | May 2024 | Original release |
Supported Parameters
| Parameter | Type | Description |
|---|---|---|
| temperature | float | Controls randomness (0.0-1.0) |
| max_tokens | integer | Maximum tokens to generate |
| min_tokens | integer | Minimum tokens to enforce |
| top_p | float | Nucleus sampling parameter |
| stop | string/array | Stop sequences |
| frequency_penalty | float | Penalty for token frequency |
| presence_penalty | float | Penalty for token presence |
| seed | integer | Random seed for reproducibility |
| tools | array | Function/tool definitions |
| tool_choice | string | Tool selection mode (auto, required, none) |
| structured_outputs | boolean | Enable structured JSON output |
| response_format | object | Output format specification |
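As a sketch of how these parameters combine in practice, the hypothetical helper below builds the JSON payload an OpenAI-compatible chat endpoint would accept, including optional sampling controls only when they are set:

```python
def build_request(prompt, temperature=0.3, max_tokens=512, stop=None, seed=None):
    # Assemble a chat-completion payload for Codestral.
    payload = {
        "model": "mistralai/codestral-2508",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    if stop is not None:
        payload["stop"] = stop   # a string or list of stop sequences
    if seed is not None:
        payload["seed"] = seed   # fixed seed for reproducible sampling
    return payload

payload = build_request("Write a binary search in Python", stop=["```"], seed=42)
```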
Modalities
- Input: Text only
- Output: Text only
Pricing
Current Pricing (Codestral 2508)
| Type | Cost per Million Tokens | Cost per 1K Tokens |
|---|---|---|
| Input | $0.30 | $0.0003 |
| Output | $0.90 | $0.0009 |
Example Cost Calculation
- 1,000 input tokens ($0.0003) + 500 output tokens ($0.00045) = $0.00075
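The arithmetic above can be captured in a small helper (rates hard-coded from the pricing table; hypothetical code, not part of any SDK):

```python
INPUT_COST_PER_M = 0.30   # USD per million input tokens
OUTPUT_COST_PER_M = 0.90  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD at Codestral 2508 list prices."""
    return (input_tokens * INPUT_COST_PER_M
            + output_tokens * OUTPUT_COST_PER_M) / 1_000_000

# 1,000 input + 500 output tokens:
print(estimate_cost(1000, 500))  # → 0.00075
```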
Pricing Context
Codestral occupies a mid-range pricing tier among coding models:
- More expensive than entry-level models like Gemma-3-4B-IT ($0.017/M input)
- Less costly than premium offerings like Claude Opus 4 ($15.00/M input)
- Comparable to Mistral's Devstral Medium ($0.30/M input, $0.90/M output)
Limitations
- No Built-in Moderation: The model does not have built-in safety guardrails
- Code-Focused: Not optimized for general conversational tasks
- Commercial License Required: MNPL-0.1 requires separate commercial license for production use
- No Image/Audio: Text-only input and output
Codestral Variants
| Model | Parameters | Context | Use Case |
|---|---|---|---|
| Codestral Mamba | 7.3B | 256K | Lightweight, infinite sequence length |
| Codestral Embed | - | 8K | Code embeddings for RAG |
Alternative Coding Models
| Model | Provider | Strengths |
|---|---|---|
| DeepSeek Coder 33B | DeepSeek | Strong MBPP performance |
| CodeLlama 70B | Meta | Open-weight alternative |
| Devstral 2 | Mistral AI | Agentic coding (123B) |
| Devstral Medium | Mistral AI | Balance of speed and capability |
Providers
Model Availability
| Model ID | Context | Status |
|---|---|---|
| mistralai/codestral-2508 | 256K | Available |
| mistralai/codestral-2501 | 256K | Deprecated |
| mistralai/codestral-latest | 256K | Redirects to latest |
Direct API Access
- codestral.mistral.ai - Monthly subscription (currently free), requires phone verification
- api.mistral.ai - Pay-per-use with existing API keys, better for business applications
Language Support
Codestral demonstrates proficiency across 80+ programming languages, including:
Popular Languages
- Python
- Java
- JavaScript/TypeScript
- C/C++
- C#
- Go
- Rust
- PHP
- Ruby
Additional Languages
- Swift
- Kotlin
- Bash/Shell
- SQL
- Fortran
- Scala
- R
- Lua
- Perl
- And 60+ more
Code Generation Benchmarks
| Benchmark | Codestral 22B | CodeLlama 70B | DeepSeek Coder 33B | Llama 3 70B |
|---|---|---|---|---|
| HumanEval (Python) | 81.1% | 67.1% | 77.4% | 76.2% |
| MBPP | 78.2% | 70.8% | 80.2% | 76.7% |
| CruxEval-O | 51.3% | 47.3% | 49.5% | 26.0% |
| RepoBench EM | 34.0% | 11.4% | 28.4% | 18.4% |
| Spider (SQL) | 63.5% | 37.0% | 60.0% | - |
- Python Code Generation: 81.1% pass rate on HumanEval, outperforming CodeLlama 70B by 14 percentage points
- Repository-Level Tasks: 34.0% exact match on RepoBench, roughly 3x CodeLlama 70B's score, helped by Codestral's larger context window (32K at launch)
- SQL Generation: 63.5% on the Spider benchmark, well ahead of the compared models
- Multi-Language: Strong averaged HumanEval performance across six-plus tested languages
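For context on how a HumanEval-style score is computed, the sketch below derives a pass@1 percentage from per-task pass/fail results (illustrative only; the real harness executes each generated sample against the task's unit tests):

```python
def pass_at_1(task_results: dict) -> float:
    # task_results maps task_id -> True if the single generated sample
    # passed all of that task's unit tests.
    return 100.0 * sum(task_results.values()) / len(task_results)

# HumanEval has 164 tasks; 133 passing yields roughly Codestral's 81.1%.
score = pass_at_1({f"task_{i}": i < 133 for i in range(164)})
print(round(score, 1))  # → 81.1
```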
API Usage Examples
Chat Completion (Python)
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.langmart.ai/v1",
    api_key="YOUR_LANGMART_API_KEY",
)

response = client.chat.completions.create(
    model="mistralai/codestral-2508",
    messages=[
        {"role": "user", "content": "Write a Python function to calculate Fibonacci numbers"}
    ],
    temperature=0.3,
    max_tokens=1000,
)

print(response.choices[0].message.content)
```
Fill-in-the-Middle (FIM)
```python
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.instruct.request import FIMRequest

# Load the tokenizer and locally downloaded model weights
tokenizer = MistralTokenizer.v3()
model = Transformer.from_folder("~/codestral-22B-240529")

# The model fills in the code between the prefix and suffix
prefix = """def add("""
suffix = """    return sum"""

request = FIMRequest(prompt=prefix, suffix=suffix)
tokens = tokenizer.encode_fim(request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.0)
result = tokenizer.decode(out_tokens[0])
print(result)
```
cURL Example
```bash
curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/codestral-2508",
    "messages": [
      {"role": "user", "content": "Explain this code: def factorial(n): return 1 if n <= 1 else n * factorial(n-1)"}
    ],
    "temperature": 0.3
  }'
```
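The JSON returned by the endpoint follows the OpenAI chat-completion schema, so the reply text and token usage can be pulled out as shown below (the `sample` string is a trimmed, hypothetical response):

```python
import json

sample = '''
{
  "choices": [
    {"message": {"role": "assistant", "content": "This is a recursive factorial."}}
  ],
  "usage": {"prompt_tokens": 30, "completion_tokens": 12, "total_tokens": 42}
}
'''

data = json.loads(sample)
reply = data["choices"][0]["message"]["content"]
used = data["usage"]["total_tokens"]
print(reply)  # → This is a recursive factorial.
print(used)   # → 42
```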
Integrations
IDE Plugins
- Continue.dev - VS Code and JetBrains integration
- Tabnine - AI coding assistant integration
- Cursor - AI-powered code editor
Frameworks
- LangChain - Python/JS framework integration
- LlamaIndex - RAG and data framework support
- Jupyter AI - Jupyter notebook integration
- E2B - Secure code execution sandboxes
- Tabby - Self-hosted AI coding assistant
Usage Statistics
Based on recent data (December 2025):
- Daily Requests: ~18,000+ requests/day
- Prompt Tokens: ~47B tokens processed daily
- Completion Tokens: ~4B tokens generated daily
Version History
| Date | Version | Changes |
|---|---|---|
| Aug 2025 | v25.08 | Latest version, improved architecture |
| Jan 2025 | v25.01 | 2x faster generation, 256K context |
| May 2024 | v24.05 | Initial release, 32K context |