LangMart: Mistral: Mixtral 8x7B Instruct
Model Overview
| Property | Value |
|---|---|
| Model ID | openrouter/mistralai/mixtral-8x7b-instruct |
| Name | Mistral: Mixtral 8x7B Instruct |
| Provider | mistralai |
| Released | 2023-12-10 |
Description
Mixtral 8x7B Instruct is a pretrained generative Sparse Mixture of Experts model by Mistral AI, fine-tuned for chat and instruction use. It incorporates 8 experts (feed-forward networks) for a total of 47 billion parameters. #moe
Provider
mistralai
Specifications
| Spec | Value |
|---|---|
| Context Window | 32,768 tokens |
| Modalities | text->text |
| Input Modalities | text |
| Output Modalities | text |
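The 32,768-token context window is shared between the prompt and the generated output, so long prompts need to leave room for the completion. The sketch below is a rough budgeting helper; the 4-characters-per-token ratio is only an approximation, and exact counts require the model's own tokenizer.

```python
# Rough budgeting against the 32,768-token context window. The
# chars-per-token ratio is a crude heuristic, not the model's tokenizer.
CONTEXT_WINDOW = 32_768
CHARS_PER_TOKEN = 4  # approximate for English text

def fits_in_context(prompt: str, reserved_for_output: int = 1_024) -> bool:
    """Return True if the prompt plus the reserved completion budget
    likely fits within the context window."""
    estimated_prompt_tokens = len(prompt) // CHARS_PER_TOKEN
    return estimated_prompt_tokens + reserved_for_output <= CONTEXT_WINDOW

print(fits_in_context("Summarize this document: ..."))  # True for short prompts
```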
Pricing
| Type | Price |
|---|---|
| Input | $0.54 per 1M tokens |
| Output | $0.54 per 1M tokens |
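Because input and output are billed at the same rate, per-request cost is simply total tokens times $0.54 per million. A minimal sketch (the helper name is made up for illustration):

```python
# Per-request cost at the listed rates: $0.54 per 1M input tokens and
# $0.54 per 1M output tokens.
INPUT_PRICE_PER_M = 0.54
OUTPUT_PRICE_PER_M = 0.54

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost for a single request."""
    return (prompt_tokens * INPUT_PRICE_PER_M
            + completion_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 1,200-token prompt with an 800-token reply costs about $0.00108.
print(f"${estimate_cost(1_200, 800):.5f}")
```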
Capabilities
- Frequency penalty
- Logit bias
- Max tokens
- Min p
- Presence penalty
- Repetition penalty
- Response format
- Seed
- Stop
- Temperature
- Tool choice
- Tools
- Top k
- Top p
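The parameters above map onto a standard OpenAI-compatible chat completions request. The sketch below assumes such a gateway; the base URL and API key are placeholders, and non-standard knobs such as top_k, min_p, and repetition_penalty would typically be passed through a provider-specific field (for example the SDK's extra_body) rather than as named arguments.

```python
from openai import OpenAI

# A minimal sketch, assuming an OpenAI-compatible gateway.
# The base URL and API key below are placeholders, not real endpoints.
client = OpenAI(
    base_url="https://example-gateway/api/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="openrouter/mistralai/mixtral-8x7b-instruct",  # model ID from this page
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain what a Mixture of Experts model is."},
    ],
    # Sampling controls from the capability list above.
    temperature=0.7,
    top_p=0.9,
    max_tokens=256,
    frequency_penalty=0.2,
    presence_penalty=0.0,
    seed=42,
    stop=["\n\n"],
)
print(response.choices[0].message.content)
```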
Detailed Analysis
Mixtral 8x7B Instruct is Mistral AI's groundbreaking Sparse Mixture of Experts (SMoE) model. It comprises 8 expert feed-forward networks of roughly 7B parameters each (about 47B parameters in total) but activates only about 13B parameters per token through learned routing, delivering quality approaching a dense ~50B model at roughly 13B-scale compute. In practice this means approximately 6x faster inference than dense 70B models while outperforming Llama 2 70B on most benchmarks. For each token, the router network selects the 2 most relevant experts, which allows experts to specialize (for example, one leaning toward code, another toward math) while keeping inference efficient.

Mixtral 8x7B excels across diverse tasks: complex code generation, mathematical reasoning, multilingual understanding (English, French, Italian, German, and Spanish), and general reasoning, all within its 32K-token context window. The model supports function calling and JSON mode for agentic applications. Released under the Apache 2.0 license, it was a landmark for open-source AI, demonstrating that MoE architectures could match proprietary models. It is ideal for applications that need near-frontier performance at a fraction of the computational cost, for self-hosting, and for research into MoE architectures.
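To make the routing description concrete, here is a deliberately simplified top-2 MoE layer in PyTorch. It is an illustrative sketch, not Mistral's implementation: the dimensions are made up, the experts are plain MLPs rather than SwiGLU blocks, and the per-expert loop trades efficiency for readability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Illustrative sparse MoE layer: each token is processed by only
    2 of the 8 experts, chosen by a learned router."""

    def __init__(self, d_model: int = 64, d_ff: int = 256, n_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Score every expert for every token.
        logits = self.router(x)                    # (tokens, n_experts)
        weights, indices = logits.topk(2, dim=-1)  # keep the 2 best experts
        weights = F.softmax(weights, dim=-1)       # renormalize over the pair

        out = torch.zeros_like(x)
        for slot in range(2):                      # the two selected experts
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e       # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(10, 64)
print(Top2MoE()(tokens).shape)  # torch.Size([10, 64])
```

Only 2 of the 8 expert MLPs run for any given token, which is why the active parameter count stays near 13B even though the full model holds about 47B parameters.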
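For the function-calling support mentioned above, a hedged sketch of a tool-call request follows, again assuming an OpenAI-compatible gateway; the get_weather tool, endpoint, and key are invented for illustration. For plain structured output without tools, the response_format capability (JSON mode) serves a similar purpose.

```python
import json
from openai import OpenAI

client = OpenAI(
    base_url="https://example-gateway/api/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

# A hypothetical tool schema; not part of the model card.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="openrouter/mistralai/mixtral-8x7b-instruct",
    messages=[{"role": "user", "content": "What's the weather in Paris right now?"}],
    tools=tools,
    tool_choice="auto",  # the model may answer directly or request the tool
)

message = response.choices[0].message
if message.tool_calls:  # present only when the model chose to call the tool
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    print(message.content)
```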