Mistral AI Codestral
Model Overview
| Property | Value |
|---|---|
| Model Name | Codestral |
| Provider | Mistral AI |
| Model ID | mistralai/codestral-latest |
| Current Version | mistralai/codestral-2508 (v25.08) |
| Model Type | Premier (Proprietary) |
| Category | Code Generation / Specialist |
| Parameter Count | 22 billion |
| Context Window | 256,000 tokens |
| Architecture | Transformer-based |
| License | MNPL-0.1 (Mistral Non-Production License) |
| Initial Release | May 29, 2024 |
| Latest Version Release | August 1, 2025 |
Description
Codestral is Mistral AI's language model purpose-built for code generation. It was Mistral's first code-specific generative model, released with open weights and optimized for programming and software development workflows.
The model specializes in low-latency, high-frequency tasks such as:
- Fill-in-the-Middle (FIM): Predict and generate code between prefix and suffix tokens
- Code Correction: Identify and fix errors in existing code
- Test Generation: Automatically generate unit tests and test cases
- Code Completion: Intelligent autocomplete for various programming languages
- Documentation Generation: Write code documentation and explanations
- Code Refactoring: Suggest improvements and restructure code
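To illustrate the FIM task above, the sketch below assembles a raw fill-in-the-middle prompt by hand. The `[SUFFIX]`/`[PREFIX]` token names follow Mistral's published FIM format, but treat the exact markup as an assumption; in practice the official tokenizer handles this encoding.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    # Codestral-style FIM: the suffix comes first, then the prefix;
    # the model generates the missing middle between them.
    return f"[SUFFIX]{suffix}[PREFIX]{prefix}"

prompt = build_fim_prompt(
    prefix="def add(a, b):\n    ",
    suffix="\n    return result",
)
```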
Technical Specifications
Model Versions
| Version | API Name | Status | Release Date | Notes |
|---|---|---|---|---|
| v25.08 | codestral-2508 | Active | August 2025 | Current production version |
| v25.01 | codestral-2501 | Deprecated | January 2025 | Retired Nov 30, 2025 |
| v24.05 | codestral-2405 | Retired | May 2024 | Original release |
Supported Parameters
| Parameter | Type | Description |
|---|---|---|
| temperature | float | Controls randomness (0.0-1.0) |
| max_tokens | integer | Maximum tokens to generate |
| min_tokens | integer | Minimum tokens to enforce |
| top_p | float | Nucleus sampling parameter |
| stop | string/array | Stop sequences |
| frequency_penalty | float | Penalty for token frequency |
| presence_penalty | float | Penalty for token presence |
| seed | integer | Random seed for reproducibility |
| tools | array | Function/tool definitions |
| tool_choice | string | Tool selection mode (auto, required, none) |
| structured_outputs | boolean | Enable structured JSON output |
| response_format | object | Output format specification |
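As a sketch of how these parameters combine in practice, the hypothetical helper below builds the JSON payload an OpenAI-compatible chat endpoint would accept, including optional sampling controls only when they are set:

```python
def build_request(prompt, temperature=0.3, max_tokens=512, stop=None, seed=None):
    # Assemble a chat-completion payload for Codestral.
    payload = {
        "model": "mistralai/codestral-2508",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    if stop is not None:
        payload["stop"] = stop   # a string or list of stop sequences
    if seed is not None:
        payload["seed"] = seed   # fixed seed for reproducible sampling
    return payload

payload = build_request("Write a binary search in Python", stop=["```"], seed=42)
```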
Modalities
- Input: Text only
- Output: Text only
Pricing
Current Pricing (Codestral 2508)
| Type | Cost per Million Tokens | Cost per 1K Tokens |
|---|---|---|
| Input | $0.30 | $0.0003 |
| Output | $0.90 | $0.0009 |
Example Cost Calculation
- 1,000 input tokens ($0.0003) + 500 output tokens ($0.00045) = $0.00075
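The arithmetic above can be captured in a small helper (rates hard-coded from the pricing table; hypothetical code, not part of any SDK):

```python
INPUT_COST_PER_M = 0.30   # USD per million input tokens
OUTPUT_COST_PER_M = 0.90  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD at Codestral 2508 list prices."""
    return (input_tokens * INPUT_COST_PER_M
            + output_tokens * OUTPUT_COST_PER_M) / 1_000_000

# 1,000 input + 500 output tokens:
print(estimate_cost(1000, 500))  # → 0.00075
```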
Pricing Context
Codestral occupies a mid-range pricing tier among coding models:
- More expensive than entry-level models like Gemma-3-4B-IT ($0.017/M input)
- Less costly than premium offerings like Claude Opus 4 ($15.00/M input)
- Comparable to Mistral's Devstral Medium ($0.30/M input, $0.90/M output)
Limitations
- No Built-in Moderation: The model does not have built-in safety guardrails
- Code-Focused: Not optimized for general conversational tasks
- Commercial License Required: MNPL-0.1 requires separate commercial license for production use
- No Image/Audio: Text-only input and output
Codestral Variants
| Model | Parameters | Context | Use Case |
|---|---|---|---|
| Codestral Mamba | 7.3B | 256K | Lightweight, infinite sequence length |
| Codestral Embed | - | 8K | Code embeddings for RAG |
Alternative Coding Models
| Model | Provider | Strengths |
|---|---|---|
| DeepSeek Coder 33B | DeepSeek | Strong MBPP performance |
| CodeLlama 70B | Meta | Open-weight alternative |
| Devstral 2 | Mistral AI | Agentic coding (123B) |
| Devstral Medium | Mistral AI | Balance of speed and capability |
Providers
Model Availability
| Model ID | Context | Status |
|---|---|---|
| mistralai/codestral-2508 | 256K | Available |
| mistralai/codestral-2501 | 256K | Deprecated |
| mistralai/codestral-latest | 256K | Redirects to latest |
Direct API Access
- codestral.mistral.ai - Monthly subscription (currently free), requires phone verification
- api.mistral.ai - Pay-per-use with existing API keys, better for business applications
Language Support
Codestral demonstrates proficiency across 80+ programming languages, including:
Popular Languages
- Python
- Java
- JavaScript/TypeScript
- C/C++
- C#
- Go
- Rust
- PHP
- Ruby
Additional Languages
- Swift
- Kotlin
- Bash/Shell
- SQL
- Fortran
- Scala
- R
- Lua
- Perl
- And 60+ more
Code Generation Benchmarks
| Benchmark | Codestral 22B | CodeLlama 70B | DeepSeek Coder 33B | Llama 3 70B |
|---|---|---|---|---|
| HumanEval (Python) | 81.1% | 67.1% | 77.4% | 76.2% |
| MBPP | 78.2% | 70.8% | 80.2% | 76.7% |
| CruxEval-O | 51.3% | 47.3% | 49.5% | 26.0% |
| RepoBench EM | 34.0% | 11.4% | 28.4% | 18.4% |
| Spider (SQL) | 63.5% | 37.0% | 60.0% | - |
- Python Code Generation: 81.1% pass rate on HumanEval, outperforming CodeLlama 70B by 14 percentage points
- Repository-Level Tasks: 34.0% exact match on RepoBench, roughly 3x CodeLlama 70B's score, helped by Codestral's larger context window (32K at launch)
- SQL Generation: 63.5% on the Spider benchmark, well ahead of the compared models
- Multi-Language: Strong averaged HumanEval performance across six-plus tested languages
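For context on how a HumanEval-style score is computed, the sketch below derives a pass@1 percentage from per-task pass/fail results (illustrative only; the real harness executes each generated sample against the task's unit tests):

```python
def pass_at_1(task_results: dict) -> float:
    # task_results maps task_id -> True if the single generated sample
    # passed all of that task's unit tests.
    return 100.0 * sum(task_results.values()) / len(task_results)

# HumanEval has 164 tasks; 133 passing yields roughly Codestral's 81.1%.
score = pass_at_1({f"task_{i}": i < 133 for i in range(164)})
print(round(score, 1))  # → 81.1
```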
API Usage Examples
Chat Completion (Python)
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.langmart.ai/v1",
    api_key="YOUR_LANGMART_API_KEY",
)

response = client.chat.completions.create(
    model="mistralai/codestral-2508",
    messages=[
        {"role": "user", "content": "Write a Python function to calculate Fibonacci numbers"}
    ],
    temperature=0.3,
    max_tokens=1000,
)

print(response.choices[0].message.content)
```
Fill-in-the-Middle (FIM)
```python
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.instruct.request import FIMRequest

# Load the tokenizer and locally downloaded model weights
tokenizer = MistralTokenizer.v3()
model = Transformer.from_folder("~/codestral-22B-240529")

# The model fills in the code between the prefix and suffix
prefix = """def add("""
suffix = """    return sum"""

request = FIMRequest(prompt=prefix, suffix=suffix)
tokens = tokenizer.encode_fim(request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.0)
result = tokenizer.decode(out_tokens[0])
print(result)
```
cURL Example
```bash
curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/codestral-2508",
    "messages": [
      {"role": "user", "content": "Explain this code: def factorial(n): return 1 if n <= 1 else n * factorial(n-1)"}
    ],
    "temperature": 0.3
  }'
```
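The JSON returned by the endpoint follows the OpenAI chat-completion schema, so the reply text and token usage can be pulled out as shown below (the `sample` string is a trimmed, hypothetical response):

```python
import json

sample = '''
{
  "choices": [
    {"message": {"role": "assistant", "content": "This is a recursive factorial."}}
  ],
  "usage": {"prompt_tokens": 30, "completion_tokens": 12, "total_tokens": 42}
}
'''

data = json.loads(sample)
reply = data["choices"][0]["message"]["content"]
used = data["usage"]["total_tokens"]
print(reply)  # → This is a recursive factorial.
print(used)   # → 42
```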
Integrations
IDE Plugins
- Continue.dev - VS Code and JetBrains integration
- Tabnine - AI coding assistant integration
- Cursor - AI-powered code editor
Frameworks
- LangChain - Python/JS framework integration
- LlamaIndex - RAG and data framework support
- Jupyter AI - Jupyter notebook integration
- E2B - Secure code execution sandboxes
- Tabby - Self-hosted AI coding assistant
Usage Statistics
Based on recent data (December 2025):
- Daily Requests: ~18,000+ requests/day
- Prompt Tokens: ~47B tokens processed daily
- Completion Tokens: ~4B tokens generated daily
Version History
| Date | Version | Changes |
|---|---|---|
| Aug 2025 | v25.08 | Latest version, improved architecture |
| Jan 2025 | v25.01 | 2x faster generation, 256K context |
| May 2024 | v24.05 | Initial release, 32K context |