Nous Research: Hermes 3 70B Instruct

Nous Research
Context: 66K

Input: $0.3000 /1M

Output: $0.3000 /1M

Max Output: N/A

Description

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities and much better roleplaying, reasoning, multi-turn conversation, and long-context coherence. It is a competitive, if not superior, fine-tune of the Llama-3.1 70B foundation model, focused on aligning LLMs to the user, with powerful steering capabilities and control given to the end user.

Pricing

Input Pricing: $0.30 per 1M tokens

Output Pricing: $0.30 per 1M tokens

Cost Ratio: 1:1 (equal input/output pricing)

Cost Profile: Excellent value for a 70B model with a large context window
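
As a worked example of the rates above, the following sketch estimates the cost of a single request. The helper name `estimate_cost_usd` and the sample token counts are illustrative assumptions, not part of any LangMart API.

```python
from decimal import Decimal

# Rates quoted in the Pricing section, in USD per 1M tokens.
INPUT_USD_PER_MILLION = Decimal("0.30")
OUTPUT_USD_PER_MILLION = Decimal("0.30")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: 10,000 prompt tokens plus 2,048 completion tokens.
    print(estimate_cost_usd(10_000, 2_048))
```

Because the input and output rates are equal, the cost depends only on the total number of tokens processed, not on how they split between prompt and completion.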

Capabilities

  • Advanced agentic capabilities
  • Excellent roleplaying
  • Advanced reasoning
  • Multi-turn conversations
  • Long context coherence
  • Function calling and structured output
  • Generalist assistant tasks
  • Code generation
  • Tool use and integration
  • Instruction following

Use Cases

  1. Conversational AI - Advanced multi-turn chat
  2. Agent Systems - Tool-using autonomous agents
  3. Function Calling - Reliable structured outputs
  4. Roleplaying - Natural character interactions
  5. Content Generation - High-quality text creation
  6. Code Generation - Strong coding capabilities
  7. Research Support - Long document analysis
  8. Business Logic - Decision support and automation

Integration with LangMart

Gateway Support: Type 2 (Cloud), Type 3 (Self-hosted)

Recommended Setup:

./core.sh start 2  # Cloud gateway
./core.sh start 3  # Self-hosted gateway

API Usage:

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nousresearch/hermes-3-llama-3.1-70b",
    "messages": [{"role": "user", "content": "Help me solve this problem..."}],
    "max_tokens": 2048
  }'
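
The same request can be sketched in Python using only the standard library. The endpoint, headers, model ID, and body fields are taken from the curl example above; the function names `build_chat_request` and `send` are illustrative, and the shape of the JSON response is not specified in this document, so `send` simply returns the decoded body.

```python
import json
import urllib.request

# Endpoint and model ID as shown in the curl example.
API_URL = "https://api.langmart.ai/v1/chat/completions"
MODEL_ID = "nousresearch/hermes-3-llama-3.1-70b"


def build_chat_request(api_key: str, prompt: str, max_tokens: int = 2048) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON body as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.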

Function Calling Example:

{
  "model": "nousresearch/hermes-3-llama-3.1-70b",
  "messages": [{"role": "user", "content": "Call this function..."}],
  "tools": [{"type": "function", "function": {...}}],
  "tool_choice": "auto"
}
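
To make the payload above concrete, here is a minimal sketch that fills in one tool definition. The `get_weather` tool, its description, and its JSON Schema parameters are hypothetical, and the layout of the `function` object follows the common OpenAI-style convention, which is assumed rather than confirmed for this gateway.

```python
import json

MODEL_ID = "nousresearch/hermes-3-llama-3.1-70b"

# Hypothetical tool definition; the field layout inside "function" is an
# assumption based on the OpenAI-style tools convention.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}


def build_tool_request(user_message: str) -> dict:
    """Return a request body that offers the model one callable tool."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [GET_WEATHER_TOOL],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }


if __name__ == "__main__":
    print(json.dumps(build_tool_request("What is the weather in Paris?"), indent=2))
```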

Related Models

From Nous Research:

  • Hermes 4 70B - Next-generation with hybrid reasoning
  • Hermes 4 405B - Frontier-level variant
  • Hermes 3 405B - Larger variant
  • Hermes 2 Mixtral 8x7B - Smaller alternative
  • DeepHermes 3 - Specialized reasoning variant

Model Information

Model ID (API): nousresearch/hermes-3-llama-3.1-70b

Provider: Nous Research (via NextBit)

Release Date: August 18, 2024

Base Model: Meta Llama-3.1 70B

Model Architecture: Transformer-based, fine-tuned from Llama 3.1

Parameters: 70 billion

Context Window: 65,536 tokens

Quantization: FP8 (via NextBit)

Input/Output Specifications

Input Modalities: Text

Output Modalities: Text

Max Context: 65,536 tokens (twice the window of models limited to 32,768 tokens)

Instruction Format: ChatML

Stop Tokens: Standard Llama 3.1 format
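
For readers unfamiliar with the ChatML instruction format named above, the sketch below renders a message list into a ChatML prompt string, wrapping each turn in `<|im_start|>` and `<|im_end|>` markers. The helper `render_chatml` is illustrative. Hosted chat endpoints normally apply the template server-side, so this is likely relevant only when working with raw completions or a self-hosted deployment.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Render chat messages as a ChatML prompt (illustrative helper).

    Each message is a dict with "role" and "content" keys.
    """
    parts = [
        f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_chatml([
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Help me solve this problem..."},
    ]))
```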

Performance Metrics

Daily Usage (December 2025):

  • Average Requests: 50,000-90,000 per day
  • Completion Tokens: 8-13 million daily
  • Consistent adoption and usage
  • Stable performance metrics

Trending: Growing adoption as a production model

Reliability: Enterprise-grade stability

Model Capabilities & Features

Agentic Capabilities

  • Function Calling: Excellent reliability
  • Tool Use: Advanced tool integration
  • Structured Output: Strict schema adherence
  • Complex Tasks: Multi-step reasoning
  • Task Planning: Sequential action planning
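
Even with reliable structured output, callers usually validate a reply before acting on it. The following is a minimal sketch of that check, assuming the model was asked to answer with a JSON object; the helper `parse_structured_reply` and the sample field names are hypothetical.

```python
import json


def parse_structured_reply(text: str, required: dict[str, type]) -> dict:
    """Parse a JSON reply and check required fields and their types.

    `required` maps field names to the Python type each value must have.
    Note that Python treats bool as a subclass of int, so a stricter
    schema validator may be preferable for production use.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    for key, expected_type in required.items():
        if key not in data:
            raise ValueError(f"missing required field: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(
                f"field {key!r} should be {expected_type.__name__}, "
                f"got {type(data[key]).__name__}"
            )
    return data


if __name__ == "__main__":
    reply = '{"city": "Paris", "temperature_c": 21.5}'
    print(parse_structured_reply(reply, {"city": str, "temperature_c": float}))
```

On a `ValueError`, a caller might retry the request or feed the error message back to the model as a correction prompt.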

Assistant Capabilities

  • Roleplaying: Natural character embodiment
  • Reasoning: Advanced multi-step logic
  • Conversations: Excellent multi-turn handling
  • Context: 65K token context retention
  • Instruction Following: Superior compliance

Supported Parameters

  • Temperature control (0-2)
  • Top-p sampling
  • Max tokens
  • Stop sequences
  • Tool choice (auto, required, specific)
  • Function calling parameters
  • Safety response formatting
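
A small sketch of how a client might validate the sampling parameters listed above before sending a request. The 0-2 temperature range comes from this list and `max_tokens` from the API usage example; the field names `temperature`, `top_p`, and `stop`, the (0, 1] range for top-p, and the default values are assumptions based on common OpenAI-style APIs.

```python
from typing import Optional, Sequence


def build_sampling_params(
    temperature: float = 0.7,  # illustrative default, not a documented one
    top_p: float = 1.0,
    max_tokens: int = 2048,
    stop: Optional[Sequence[str]] = None,
) -> dict:
    """Validate sampling settings and return them as request fields."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0 and 2")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in the range (0, 1]")
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    params = {
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }
    if stop:
        params["stop"] = list(stop)
    return params


if __name__ == "__main__":
    print(build_sampling_params(temperature=0.2, stop=["\n\n"]))
```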

Strengths

  • Large 70B-parameter model
  • Extended 65K token context
  • Advanced reasoning capabilities
  • Excellent function calling
  • Versatile instruction following
  • Strong roleplay capabilities
  • Production-ready deployment
  • Good reasoning performance
  • Reliable tool integration

Comparison with Hermes 2

Improvements over Hermes 2:

  • Better agentic capabilities
  • Superior reasoning
  • Improved multi-turn handling
  • Extended context window
  • Enhanced instruction following
  • Better tool use reliability
  • Improved alignment
  • More flexible steering

Extended Context Advantages

65K Context Window Enables:

  • Full document analysis
  • Extended conversations
  • Complex multi-document tasks
  • Large codebase understanding
  • Long-form content generation
  • Comprehensive context retention
  • Advanced reasoning over large contexts
  • Novel information synthesis
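
Long-context workloads still need a token budget. The sketch below assumes that prompt and completion tokens share the 65,536-token window (typical for decoder-only models, but not stated in this document) and that the input has already been tokenized by whatever tokenizer the deployment uses. Both helper names are illustrative.

```python
CONTEXT_WINDOW = 65_536  # tokens, from the model specification


def remaining_completion_budget(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for the completion once the prompt is accounted for."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


def chunk_tokens(tokens: list, chunk_size: int, overlap: int = 0) -> list[list]:
    """Split a token sequence into windows of at most `chunk_size` tokens.

    Consecutive chunks share `overlap` tokens so that context is not lost
    at chunk boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the input
    return chunks


if __name__ == "__main__":
    print(remaining_completion_budget(60_000))
    print(chunk_tokens(list(range(10)), chunk_size=4, overlap=1))
```

Documents that exceed the window can be processed chunk by chunk, with each chunk sized to leave room for instructions and the expected completion.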

Agentic Use Cases

Agent Design Patterns:

  • Tool-calling agents
  • Multi-step task planning
  • Conversational agents
  • Decision-support systems
  • Autonomous workflow execution
  • Complex problem decomposition
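
The tool-calling pattern above can be sketched as a simple loop: call the model, execute any tools it requests, append the results, and repeat until it answers in plain text. This sketch assumes OpenAI-style assistant messages, with a `tool_calls` list whose entries carry an `id` and a `function` holding `name` and JSON-encoded `arguments`; that response shape is not confirmed by this document. `call_model` is a placeholder for whatever function sends the conversation to the API.

```python
import json
from typing import Callable


def run_agent(
    call_model: Callable[[list], dict],
    tools: dict[str, Callable],
    messages: list,
    max_steps: int = 5,
) -> str:
    """Run a tool-calling loop until the model replies without tool calls.

    `call_model` takes the message list and returns one assistant message.
    `tools` maps tool names to Python callables taking keyword arguments.
    """
    for _ in range(max_steps):
        reply = call_model(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply.get("content") or ""
        for call in tool_calls:
            name = call["function"]["name"]
            args = json.loads(call["function"]["arguments"] or "{}")
            if name in tools:
                result = tools[name](**args)
            else:
                # Report the problem to the model instead of crashing.
                result = {"error": f"unknown tool: {name}"}
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    raise RuntimeError("agent did not finish within max_steps")
```

The `max_steps` cap guards against a model that keeps requesting tools indefinitely; the right limit depends on the workflow.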

Fine-tuning Approach

Training Method: Advanced instruction tuning

Alignment: User-centric alignment

Safety: Balanced safety and helpfulness

Steering: Powerful user control capabilities

Optimization: Nous Research proprietary techniques

Deployment Characteristics

Quantization: FP8 (NextBit optimized)

Performance: Excellent speed/quality balance

Scalability: Suitable for large-scale deployments

Optimization: NextBit hosting optimization

Performance Recommendations

Best For:

  • Agent-based systems
  • Complex reasoning tasks
  • Extended context needs
  • Function-heavy applications
  • Production deployments
  • Instruction-following tasks
  • Multi-turn conversations
  • Research and analysis

Ideal Applications:

  • Autonomous agents
  • Tool-calling systems
  • Conversational interfaces
  • Content platforms
  • Analysis and research
  • Business automation
  • Creative applications

Deployment Recommendations

Recommended For:

  • Production systems
  • Enterprise deployment
  • High-throughput services
  • Scalable applications
  • Agent-based architectures
  • Long-context workloads

Strengths:

  • Extended 65K context
  • Advanced agentic capabilities
  • Strong reasoning
  • Production-ready
  • Flexible steering
  • Excellent function calling
  • Good cost-performance

Testing Results

Reasoning: Excellent (advanced agentic)

Function Calling: Highly reliable

Code Generation: Strong quality (70B model)

Context Handling: Superior (65K tokens)

Instruction Following: Excellent compliance

Roleplaying: Natural and engaging

Quality Metrics

  • Function Calling Accuracy: 95%+
  • Instruction Following: 95%+
  • Reasoning Quality: Excellent
  • Code Correctness: High (90%+)
  • Context Preservation: Excellent
  • Agent Reliability: Production-grade

Nous Research Philosophy

  • User-aligned models
  • Powerful steering capabilities
  • End-user control
  • Advanced capabilities
  • Instruction tuning excellence
  • Agentic focus
  • Production reliability

Last Updated: December 24, 2025