
Nous: Hermes 2 Mixtral 8x7B DPO

Nous Research

Description

Nous Hermes 2 Mixtral 8x7B DPO is the flagship Nous Research model, trained on top of the Mixtral 8x7B MoE LLM. The model was trained on over 1,000,000 entries of primarily GPT-4-generated data, along with other high-quality data from open datasets across the AI landscape, achieving state-of-the-art performance on a variety of tasks.

Pricing

Input Pricing: Not disclosed on platform

Output Pricing: Not disclosed on platform

Note: Check LangMart pricing page for current rates

Capabilities

  • Text-to-text inference
  • ChatML instruction format
  • Direct Preference Optimization (DPO) tuning
  • Mixture of Experts routing
  • Function calling support
  • Structured output generation
  • Multi-turn conversations

Related Models

  • nousresearch/hermes-3-llama-3.1-405b - Latest Hermes model on Llama 3.1
  • mistralai/mixtral-8x7b-instruct - Base Mixtral model
  • nousresearch/nous-hermes-2-mixtral-8x7b-sft - SFT version without DPO
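
The function-calling support noted under Capabilities is typically exercised through OpenAI-style tool definitions in the request body. The sketch below is illustrative only: the `get_weather` tool is hypothetical, and whether a given provider honors `tools` for this model is not guaranteed by the model card.

```python
# Hypothetical tool definition in the OpenAI-compatible "tools" format.
# The function name and schema are our own illustration, not part of the model.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

# Request body attaching the tool; the model may respond with a tool call
# instead of plain text if the provider supports function calling.
request_body = {
    "model": "nousresearch/nous-hermes-2-mixtral-8x7b-dpo",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [get_weather_tool],
}
```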

Model Information

Model ID (API): nousresearch/nous-hermes-2-mixtral-8x7b-dpo

Provider: Nous Research

Release Date: January 16, 2024

Model Architecture: Mixture of Experts (MoE) - Mixtral 8x7B base

Parameters: 8 experts × 7B (≈46.7B total due to shared non-expert layers; ~12.9B active per token)

Context Window: 32,768 tokens

Input/Output Specifications

Input Modalities: Text

Output Modalities: Text

Instruction Format: ChatML

Stop Sequences:

  • <|im_start|>
  • <|im_end|>
  • <|endoftext|>

Performance Metrics

  • Training Data: 1,000,000+ high-quality entries
  • Data Mix: Primarily GPT-4 generated with open-source datasets
  • Optimization Method: Direct Preference Optimization (DPO)
  • Recent Activity: Limited usage data available

Model Capabilities & Features

Supported Parameters

  • Temperature control
  • Top-p sampling
  • Stop sequences
  • Max tokens
  • Frequency/presence penalties

Strengths

  • Excellent reasoning capabilities
  • Strong instruction following
  • Code generation abilities
  • Improved performance from DPO tuning
  • Effective with function calling
  • Good multi-turn conversation handling

API Usage

LangMart Endpoint

POST https://api.langmart.ai/v1/chat/completions

Request Format (OpenAI-Compatible)

{
  "model": "nousresearch/nous-hermes-2-mixtral-8x7b-dpo",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful AI assistant."
    },
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1024
}

cURL Example

curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nousresearch/nous-hermes-2-mixtral-8x7b-dpo",
    "messages": [
      {
        "role": "user",
        "content": "Explain the concept of mixture of experts in machine learning."
      }
    ]
  }'

Python Example

import openai

client = openai.OpenAI(
    base_url="https://api.langmart.ai/v1",
    api_key="YOUR_LANGMART_API_KEY"
)

response = client.chat.completions.create(
    model="nousresearch/nous-hermes-2-mixtral-8x7b-dpo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is DPO training?"}
    ],
    temperature=0.7,
    max_tokens=1024
)

print(response.choices[0].message.content)


Configurable Parameters

Parameter          Type     Default  Description
temperature        float    1.0      Controls randomness (0.0-2.0)
max_tokens         integer  -        Maximum tokens to generate
top_p              float    1.0      Nucleus sampling parameter
top_k              integer  -        Top-k sampling parameter
frequency_penalty  float    0.0      Penalizes tokens in proportion to how often they have appeared
presence_penalty   float    0.0      Penalizes tokens that have already appeared at least once
stop               array    -        Custom stop sequences
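
As a sketch, these parameters map onto the OpenAI-compatible request body like this; the specific values are illustrative, not recommendations from the model card:

```python
# Example request body exercising the configurable parameters above.
payload = {
    "model": "nousresearch/nous-hermes-2-mixtral-8x7b-dpo",
    "messages": [{"role": "user", "content": "Summarize mixture-of-experts routing."}],
    "temperature": 0.7,        # lower than the 1.0 default for steadier output
    "top_p": 0.9,              # nucleus sampling
    "max_tokens": 512,         # cap on generated tokens
    "frequency_penalty": 0.2,  # discourage frequently repeated tokens
    "presence_penalty": 0.1,   # discourage reusing any already-seen token
    "stop": ["<|im_end|>"],    # custom stop sequence matching the ChatML format
}
```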

ChatML Format

This model uses the ChatML instruction format:

<|im_start|>system
You are a helpful AI assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
Hello! How can I help you today?<|im_end|>
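
For clients that send raw prompts rather than message lists, the template above can be rendered with a small helper. This is a minimal sketch: the `to_chatml` function is our own illustration (not part of any library), using the special tokens listed on this page.

```python
# Hypothetical helper that renders a message list into the ChatML format
# shown above. The token names come from the model card; the function is
# an illustration, not a library API.
def to_chatml(messages):
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```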

Available Providers

This model is available through various inference providers including:

  • OpenRouter
  • Together AI
  • Fireworks AI
  • Other providers hosting Mixtral-based models

Training Details

  • Training Method: Direct Preference Optimization (DPO)
  • Training Data: Over 1,000,000 entries
  • Data Sources: Primarily GPT-4 generated data + high-quality open datasets
  • Architecture: 8 experts with 7B parameters each, 2 experts active per token at each layer
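
The routing scheme described above can be illustrated with a toy top-2 gating function. This is only a sketch of the gating arithmetic; real Mixtral routing operates on hidden states inside every transformer layer, with learned router weights.

```python
import math

# Toy top-2 mixture-of-experts routing: pick the two experts with the
# highest router logits and renormalize their scores with a softmax over
# just those two, as in the 8-experts / 2-active scheme described above.
def top2_route(logits):
    top2 = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:2]
    exps = [math.exp(logits[i]) for i in top2]
    total = sum(exps)
    weights = [e / total for e in exps]
    return top2, weights

# Illustrative router logits for 8 experts.
experts, weights = top2_route([0.1, 2.0, -1.0, 0.5, 0.0, 1.5, -0.3, 0.2])
print(experts, weights)  # experts 1 and 5 carry this token
```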

Last updated: December 2024
Source: LangMart