Nous: Hermes 2 Mistral 7B DPO

Nous Research

Description

Hermes 2 Mistral 7B DPO is the flagship 7B model in the Hermes lineup, refined with Direct Preference Optimization (DPO). It is derived from Teknium/OpenHermes-2.5-Mistral-7B and, according to Nous Research, shows "improvement across the board on all benchmarks tested - AGIEval, BigBench Reasoning, GPT4All, and TruthfulQA."

The underlying model was trained on approximately one million instructions and conversations of GPT-4 quality or better, drawn predominantly from synthetic datasets and other high-quality sources.

Technical Specifications

Model Architecture

  • Architecture Group: Mistral
  • Parameter Count: 7 Billion
  • Model Family: Mistral-based
  • Context Window: 8,192 tokens
  • Instruction Format: ChatML (see the prompt example below)

Input/Output

  • Input Modalities: Text
  • Output Modalities: Text
  • Default Stop Sequences: <|im_start|>, <|im_end|>, <|endoftext|>
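
For reference, ChatML wraps every turn in <|im_start|> and <|im_end|> delimiters; this is the standard layout used by OpenHermes-family models. A prompt asking a single question looks like this:

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Explain quantum computing in simple terms.<|im_end|>
<|im_start|>assistant

Generation ends when the model emits one of the default stop sequences listed above.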

Training Data

  • Training Set Size: 1,000,000 instructions/conversations
  • Data Quality: GPT-4 quality or better
  • Data Sources: Primarily synthetic datasets and premium training sources
  • Fine-tuning Technique: Direct Preference Optimization (DPO); a sketch of the objective follows this list
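
For orientation, DPO trains the policy directly on preference pairs, with no separate reward model, by maximizing the log-sigmoid of an implicit reward margin against a frozen reference model. Below is a minimal Python sketch of the standard objective (Rafailov et al., 2023), assuming per-sequence log-probabilities have already been computed; the beta value is a typical choice, and the exact recipe Nous Research used is not documented here.

import math

def dpo_loss(pi_logp_chosen, pi_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Implicit rewards: log-prob ratios of the policy vs. a frozen reference.
    # beta=0.1 is a commonly used value, not one documented for this model.
    reward_chosen = beta * (pi_logp_chosen - ref_logp_chosen)
    reward_rejected = beta * (pi_logp_rejected - ref_logp_rejected)
    # Negative log-sigmoid of the reward margin; minimizing this pushes
    # the policy to prefer the chosen response over the rejected one.
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Example: the policy already slightly prefers the chosen response
print(dpo_loss(-12.0, -15.0, -13.0, -14.0))

In practice this is computed over batches of token-level log-probabilities in a framework such as PyTorch; the scalar version above only shows the arithmetic.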

Capabilities

  • Text-based Conversations: Excellent performance in chat and dialogue scenarios
  • Instruction Following: Robust instruction understanding and execution
  • Reasoning Tasks: Strong performance on reasoning benchmarks
  • Problem Solving: Capable of tackling complex problems across various domains

Similar 7B-parameter models built on the Mistral base:

  • Mistral 7B Instruct
  • OpenHermes 2.5 Mistral
  • Neural Chat 7B
  • Zephyr 7B

Model Information

  • Context Window: 8,192 tokens
  • Model Name: Nous: Hermes 2 Mistral 7B DPO
  • Inference Model ID: nousresearch/nous-hermes-2-mistral-7b-dpo
  • Organization: Nous Research
  • Release Date: February 21, 2024
  • Model Type: Large Language Model (LLM)
  • Base Model: Mistral 7B
  • Fine-tuning Method: Direct Preference Optimization (DPO)
  • Hugging Face Weights: NousResearch/Nous-Hermes-2-Mistral-7B-DPO

Benchmark Performance

The model demonstrates improvements across multiple evaluation benchmarks:

  • AGIEval: Improved
  • BigBench Reasoning: Improved
  • GPT4All: Improved
  • TruthfulQA: Improved

Pricing & Provider Information

Note: At the time of writing, LangMart reports insufficient analytics data to display pricing and provider information. Pricing may vary by provider and region.

Finding Pricing Information

  • Check LangMart's platform for real-time pricing
  • Monitor for updates as provider data becomes available
  • Expect possible regional variation in rates

Usage Examples

Basic Chat Completion

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nousresearch/nous-hermes-2-mistral-7b-dpo",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ]
  }'

Python Example

import requests

api_key = "YOUR_API_KEY"
model_id = "nousresearch/nous-hermes-2-mistral-7b-dpo"

response = requests.post(
    "https://api.langmart.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    },
    json={
        "model": model_id,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What are the main features of this model?"}
        ],
        "temperature": 0.7,
        "max_tokens": 512
    }
)

print(response.json())
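
Assuming the endpoint mirrors the OpenAI chat-completions schema, as the examples here imply, the generated text sits in the first choice:

reply = response.json()["choices"][0]["message"]["content"]
print(reply)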

JavaScript Example

const apiKey = "YOUR_API_KEY";
const modelId = "nousresearch/nous-hermes-2-mistral-7b-dpo";

const response = await fetch("https://api.langmart.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: modelId,
    messages: [
      { role: "user", content: "How does Direct Preference Optimization work?" }
    ],
    temperature: 0.8,
    max_tokens: 1024
  })
});

const data = await response.json();
console.log(data);

Model Parameters

Common parameters when using this model with OpenAI-compatible APIs:

  • temperature (Float, default 1.0, range 0.0 to 2.0): Controls randomness; 0 yields near-deterministic output, 2 is very random
  • max_tokens (Integer, no default, range 1 to 8192): Maximum number of tokens to generate in the response
  • top_p (Float, default 1.0, range 0.0 to 1.0): Nucleus sampling; samples from the smallest token set whose cumulative probability reaches top_p
  • top_k (Integer, no default, 1 or more): Samples only from the k highest-probability tokens
  • frequency_penalty (Float, default 0, range -2.0 to 2.0): Penalizes tokens in proportion to how often they have already appeared
  • presence_penalty (Float, default 0, range -2.0 to 2.0): Penalizes any token that has already appeared, encouraging new topics
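
As a quick illustration, a request body combining several of these parameters might look like the following; the specific values are illustrative only, not tuned recommendations:

payload = {
    "model": "nousresearch/nous-hermes-2-mistral-7b-dpo",
    "messages": [{"role": "user", "content": "Summarize DPO in two sentences."}],
    "temperature": 0.2,       # low randomness for factual answers
    "top_p": 0.9,             # nucleus sampling over the top 90% of probability mass
    "frequency_penalty": 0.3  # gently discourage repetition
}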

Supported Formats

  • Instruction Format: ChatML
  • API Format: OpenAI-compatible
  • Streaming: Supported (see the sketch below)
  • Function Calling: May be supported depending on provider
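
Since streaming is supported, responses can be consumed incrementally. A minimal Python sketch, assuming the endpoint follows the OpenAI server-sent-events convention ("data:" lines carrying JSON chunks, terminated by "data: [DONE]"):

import json
import requests

with requests.post(
    "https://api.langmart.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "nousresearch/nous-hermes-2-mistral-7b-dpo",
        "messages": [{"role": "user", "content": "Tell me a short story."}],
        "stream": True
    },
    stream=True  # keep the HTTP connection open and read incrementally
) as response:
    for line in response.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        chunk = line[len(b"data: "):]
        if chunk == b"[DONE]":
            break
        # Each chunk carries an incremental delta for the first choice
        delta = json.loads(chunk)["choices"][0].get("delta", {})
        print(delta.get("content", ""), end="", flush=True)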

Integration Points

OpenRouter

Available on LangMart.ai through the OpenRouter provider, under the inference ID: nousresearch/nous-hermes-2-mistral-7b-dpo

LangMart Registry

Can be integrated into LangMart with the following configuration:

{
  "modelId": "nousresearch/nous-hermes-2-mistral-7b-dpo",
  "name": "Nous Hermes 2 Mistral 7B DPO",
  "provider": "openrouter",
  "capabilities": {
    "chat": true,
    "completion": true,
    "streaming": true
  },
  "context_window": 8192,
  "training_data_cutoff": "2024-02-21"
}

Notes

  • This model is optimized for instruction-following and conversational tasks
  • Direct Preference Optimization (DPO) fine-tuning resulted in improved performance across multiple benchmarks
  • The 8K context window is suitable for most general-purpose tasks
  • Performance characteristics may vary depending on the hosting provider

Document Metadata

  • Created: 2024-02-21
  • Last Updated: 2025-12-23
  • Source: LangMart.ai
  • Status: Active

Additional Resources