Microsoft Phi-4
Description
Phi-4 targets "complex reasoning tasks and operates efficiently with limited memory or when quick responses are needed." The 14-billion-parameter model was trained on synthetic datasets, curated websites, and academic materials. It emphasizes instruction-following accuracy, maintains safety standards, and is optimized for English inputs.
Technical Report: arXiv:2412.08905
Technical Specifications
| Specification | Value |
|---|---|
| Parameters | 14 billion |
| Context Length | 16,384 tokens |
| Input Modalities | Text |
| Output Modalities | Text |
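Because the 16,384-token context is shared between prompt and completion, it can help to budget-check a prompt before sending a request. Below is a minimal sketch using the tokenizer published under the model's HuggingFace slug (see Model Identity); the helper name and the prompt/completion split are illustrative assumptions, not part of any official API.

```python
from transformers import AutoTokenizer

CONTEXT_LENGTH = 16_384  # total budget shared by prompt and completion

# Tokenizer published under the model's HuggingFace slug.
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-4")

def fits_context(prompt: str, max_tokens: int = 1_024) -> bool:
    """Check that the prompt plus a reserved completion budget fits.

    `max_tokens` mirrors the request parameter of the same name; the
    1,024-token reservation is an illustrative default, not a spec.
    """
    prompt_tokens = len(tokenizer.encode(prompt))
    return prompt_tokens + max_tokens <= CONTEXT_LENGTH

print(fits_context("Summarize the Phi-4 technical report in three bullet points."))
```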
Training Focus
- High-quality synthetic data
- Curated web content
- Academic materials
Pricing
| Type | Cost (per million tokens) |
|---|---|
| Input | $0.06 |
| Output | $0.14 |
Provider: NextBit (quantized to int4)
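At these rates, estimating request cost is a straight per-token multiplication. A small sketch of the arithmetic; the usage figures in the example are made-up inputs, not measured data.

```python
INPUT_RATE = 0.06 / 1_000_000   # USD per input token  ($0.06 per million)
OUTPUT_RATE = 0.14 / 1_000_000  # USD per output token ($0.14 per million)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request at the listed rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: 12,000 prompt tokens and 1,500 completion tokens (hypothetical).
print(f"${request_cost(12_000, 1_500):.6f}")  # $0.000930
```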
Capabilities
Phi-4 is designed for:
- Complex reasoning tasks
- Efficient operation with limited memory
- Quick response generation
- Instruction-following accuracy
- Safe and responsible outputs
Optimized for: English language inputs
Supported Parameters
| Parameter | Supported |
|---|---|
| max_tokens | Yes |
| temperature | Yes |
| top_p | Yes |
| stop | Yes |
| frequency_penalty | Yes |
| presence_penalty | Yes |
| response_format | Yes |
| structured_outputs | Yes |
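These parameters map onto the OpenAI-style chat completions interface. A minimal request sketch exercising them together, assuming an OpenAI-compatible endpoint; the base URL and API key are placeholders, not LangMart's actual values.

```python
from openai import OpenAI

# Placeholder endpoint and key; substitute your provider's values.
client = OpenAI(base_url="https://example.com/api/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="microsoft/phi-4",
    messages=[{"role": "user", "content": "List three uses of synthetic training data."}],
    max_tokens=256,          # cap the completion length
    temperature=0.7,         # sampling randomness
    top_p=0.9,               # nucleus sampling cutoff
    stop=["\n\n"],           # stop sequence(s)
    frequency_penalty=0.1,   # discourage verbatim repetition
    presence_penalty=0.0,    # discourage reusing topics
)
print(response.choices[0].message.content)
```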
Related Models
Other Microsoft Phi models in the series:
- Phi-3 series
- Phi-2
- Phi-1.5
Model Identity
| Field | Value |
|---|---|
| Name | Microsoft: Phi 4 |
| Model ID | microsoft/phi-4 |
| Short Name | Phi 4 |
| Author | Microsoft Research |
| Created | January 10, 2025 |
| HuggingFace Slug | microsoft/phi-4 |
Features
- Tool Choice Support: `none`, `auto`, `required`, or a named function (`type: "function"`); see the sketch after this list
- Structured Output Capabilities: Yes
- Chat Completions: Enabled
- Text Completions: Enabled
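The tool-choice modes follow the OpenAI function-calling convention, and structured output is requested via `response_format`. A sketch of forcing a named function call and requesting JSON output, assuming the same OpenAI-compatible client as above; the `get_weather` tool is hypothetical and exists only for illustration.

```python
from openai import OpenAI

client = OpenAI(base_url="https://example.com/api/v1", api_key="YOUR_API_KEY")  # placeholders

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="microsoft/phi-4",
    messages=[{"role": "user", "content": "What's the weather in Redmond?"}],
    tools=tools,
    # One of: "none", "auto", "required", or a specific function:
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
print(response.choices[0].message.tool_calls)

# Structured output: ask the model to emit a JSON object.
structured = client.chat.completions.create(
    model="microsoft/phi-4",
    messages=[{"role": "user", "content": "Return a JSON object with keys 'city' and 'temp_c'."}],
    response_format={"type": "json_object"},
)
print(structured.choices[0].message.content)
```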
Access Points
| Access Type | URL |
|---|---|
| Chat Interface | /chat?models=microsoft/phi-4 |
| Model Comparison | /compare/microsoft/phi-4 |
| Model Weights | HuggingFace (microsoft/phi-4) |
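Since the weights are published on HuggingFace, the model can also be run locally. A minimal sketch using the transformers library; the dtype choice and memory note (roughly 28 GB in bf16 for 14B parameters) are assumptions about a typical setup, not vendor guidance.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-4")
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-4",
    torch_dtype=torch.bfloat16,  # ~28 GB for 14B params; needs a large GPU
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain nucleus sampling in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```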
Usage Statistics
Recent daily analytics show substantial adoption, with millions of tokens processed across thousands of requests. Peak activity was observed on December 5 and December 18, 2025.
Data scraped from LangMart on December 23, 2025