Nous Research: Hermes 3 70B Instruct
Description
Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the board. It represents a competitive, if not superior, finetune of the Llama-3.1 70B foundation model, focused on aligning LLMs to the user with powerful steering capabilities and control given to the end user.
Pricing
Input Pricing: $0.30 per 1M tokens
Output Pricing: $0.30 per 1M tokens
Cost Ratio: 1:1 (equal input/output pricing)
Cost Profile: Excellent value for a 70B model with a large context window
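As a rough worked example (illustrative request sizes, not a quote): a call that sends 10,000 input tokens and returns 2,000 output tokens costs (10,000 / 1,000,000) x $0.30 + (2,000 / 1,000,000) x $0.30 = $0.003 + $0.0006 = $0.0036, or roughly 275-280 such calls per dollar.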
Capabilities
- Advanced agentic capabilities
- Excellent roleplaying
- Advanced reasoning
- Multi-turn conversations
- Long context coherence
- Function calling and structured output
- Generalist assistant tasks
- Code generation
- Tool use and integration
- Instruction following
Use Cases
- Conversational AI - Advanced multi-turn chat
- Agent Systems - Tool-using autonomous agents
- Function Calling - Reliable structured outputs
- Roleplaying - Natural character interactions
- Content Generation - High-quality text creation
- Code Generation - Strong coding capabilities
- Research Support - Long document analysis
- Business Logic - Decision support and automation
Integration with LangMart
Gateway Support: Type 2 (Cloud), Type 3 (Self-hosted)
Recommended Setup:
./core.sh start 2 # Cloud gateway
./core.sh start 3 # Self-hosted gateway
API Usage:
curl -X POST https://api.langmart.ai/v1/chat/completions \
-H "Authorization: Bearer sk-your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model": "nousresearch/hermes-3-llama-3.1-70b",
"messages": [{"role": "user", "content": "Help me solve this problem..."}],
"max_tokens": 2048
}'
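If the gateway supports the standard OpenAI-style stream flag (an assumption worth confirming against the LangMart documentation), the same request can be streamed token by token:
curl -N -X POST https://api.langmart.ai/v1/chat/completions \
-H "Authorization: Bearer sk-your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model": "nousresearch/hermes-3-llama-3.1-70b",
"messages": [{"role": "user", "content": "Help me solve this problem..."}],
"max_tokens": 2048,
"stream": true
}'
In OpenAI-compatible gateways, streamed output typically arrives as server-sent events (data: lines) terminated by data: [DONE].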
Function Calling Example:
{
"model": "nousresearch/hermes-3-llama-3.1-70b",
"messages": [{"role": "user", "content": "Call this function..."}],
"tools": [{"type": "function", "function": {...}}],
"tool_choice": "auto"
}
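The function object above is elided; a filled-in sketch using a hypothetical get_current_weather tool (the schema follows the OpenAI-style tools format implied by the example) could look like this:
{
  "model": "nousresearch/hermes-3-llama-3.1-70b",
  "messages": [{"role": "user", "content": "What's the weather in Berlin right now?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather for a city",
      "parameters": {
        "type": "object",
        "properties": {
          "city": {"type": "string", "description": "City name, e.g. Berlin"},
          "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["city"]
      }
    }
  }],
  "tool_choice": "auto"
}
A successful call should come back with an assistant message containing a tool_calls array rather than plain text.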
Related Models
From Nous Research:
- Hermes 4 70B - Next-generation with hybrid reasoning
- Hermes 4 405B - Frontier-level variant
- Hermes 3 405B - Larger variant
- Hermes 2 Mixtral 8x7B - Smaller alternative
- DeepHermes 3 - Specialized reasoning variant
Model Information
Model ID (API): nousresearch/hermes-3-llama-3.1-70b
Provider: Nous Research (via NextBit)
Release Date: August 18, 2024
Base Model: Meta Llama-3.1 70B
Model Architecture: Transformer-based fine-tuned from Llama 3.1
Parameters: 70 billion
Context Window: 65,536 tokens
Quantization: FP8 (via NextBit)
Input/Output Specifications
Input Modalities: Text
Output Modalities: Text
Max Context: 65,536 tokens (about twice the window of many comparable hosted deployments)
Instruction Format: ChatML
Stop Tokens: Standard Llama 3.1 format
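For reference, ChatML wraps each turn in <|im_start|> / <|im_end|> markers. A minimal prompt rendered in that template (relevant mainly for raw-prompt or self-hosted use; the chat completions endpoint applies the template automatically) looks roughly like:
<|im_start|>system
You are Hermes 3, a helpful assistant.<|im_end|>
<|im_start|>user
Summarize the attached report in three bullet points.<|im_end|>
<|im_start|>assistant
When prompting the model directly, <|im_end|> is typically used as the stop sequence so generation ends cleanly at the close of the assistant turn.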
Performance Metrics
Daily Usage (December 2025):
- Average Requests: 50,000-90,000 per day
- Completion Tokens: 8-13 million daily
- Consistent adoption and usage
- Stable performance metrics
Trending: Growing adoption as a production model
Reliability: Enterprise-grade stability
Model Capabilities & Features
Agentic Capabilities
- Function Calling: Excellent reliability
- Tool Use: Advanced tool integration
- Structured Output: Strict schema adherence
- Complex Tasks: Multi-step reasoning
- Task Planning: Sequential action planning
Assistant Capabilities
- Roleplaying: Natural character embodiment
- Reasoning: Advanced multi-step logic
- Conversations: Excellent multi-turn handling
- Context: 65K token context retention
- Instruction Following: Superior compliance
Supported Parameters
- Temperature control (0-2)
- Top-p sampling
- Max tokens
- Stop sequences
- Tool choice (auto, required, specific)
- Function calling parameters
- Safety response formatting
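A request body exercising several of these parameters at once (the values are illustrative, not tuned recommendations) might look like:
{
  "model": "nousresearch/hermes-3-llama-3.1-70b",
  "messages": [{"role": "user", "content": "List three risks of the proposed rollout."}],
  "temperature": 0.7,
  "top_p": 0.9,
  "max_tokens": 1024,
  "stop": ["###"]
}
tool_choice and the other function-calling parameters apply only when a tools array is supplied, as in the function calling example earlier.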
Strengths
- Large 70B parameter model
- Extended 65K token context
- Advanced reasoning capabilities
- Excellent function calling
- Versatile instruction following
- Strong roleplay capabilities
- Production-ready deployment
- Good reasoning performance
- Reliable tool integration
Comparison with Hermes 2
Improvements over Hermes 2:
- Better agentic capabilities
- Superior reasoning
- Improved multi-turn handling
- Extended context window
- Enhanced instruction following
- Better tool use reliability
- Improved alignment
- More flexible steering
Extended Context Advantages
65K Context Window Enables:
- Full document analysis
- Extended conversations
- Complex multi-document tasks
- Large codebase understanding
- Long-form content generation
- Comprehensive context retention
- Advanced reasoning over large contexts
- Novel information synthesis
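A quick budget check helps when planning long-context requests: with the 65,536-token window and max_tokens set to 2,048 for the reply, roughly 63,000 tokens remain for the system prompt, conversation history, and attached documents. At a common rule of thumb of about 0.75 English words per token, that is on the order of 45,000-48,000 words of input before truncation becomes a concern.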
Agentic Use Cases
Agent Design Patterns:
- Tool-calling agents
- Multi-step task planning
- Conversational agents
- Decision-support systems
- Autonomous workflow execution
- Complex problem decomposition
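To close the tool-calling loop, the agent appends the model's tool_calls message and a tool-role result, then calls the API again. A sketch of that second request in the same OpenAI-style format (the call id, function name, and result payload are hypothetical):
{
  "model": "nousresearch/hermes-3-llama-3.1-70b",
  "messages": [
    {"role": "user", "content": "What's the weather in Berlin right now?"},
    {"role": "assistant", "content": null, "tool_calls": [{
      "id": "call_abc123",
      "type": "function",
      "function": {"name": "get_current_weather", "arguments": "{\"city\": \"Berlin\", \"unit\": \"celsius\"}"}
    }]},
    {"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temp_c\": 6, \"conditions\": \"overcast\"}"}
  ]
}
The model then answers in natural language from the tool output; repeating this call, execute, append cycle is the basic loop behind the agent patterns listed above.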
Fine-tuning Approach
Training Method: Advanced instruction tuning
Alignment: User-centric alignment
Safety: Balanced safety and helpfulness
Steering: Powerful user control capabilities
Optimization: Nous Research proprietary techniques
Deployment Characteristics
Quantization: FP8 (NextBit optimized)
Performance: Excellent speed/quality balance
Scalability: Suitable for large-scale deployments
Optimization: NextBit hosting optimization
Performance Recommendations
Best For:
- Agent-based systems
- Complex reasoning tasks
- Extended context needs
- Function-heavy applications
- Production deployments
- Instruction-following tasks
- Multi-turn conversations
- Research and analysis
Ideal Applications:
- Autonomous agents
- Tool-calling systems
- Conversational interfaces
- Content platforms
- Analysis and research
- Business automation
- Creative applications
Deployment Recommendations
Recommended For:
- Production systems
- Enterprise deployment
- High-throughput services
- Scalable applications
- Agent-based architectures
- Long-context workloads
Strengths:
- Extended 65K context
- Advanced agentic capabilities
- Strong reasoning
- Production-ready
- Flexible steering
- Excellent function calling
- Good cost-performance
Testing Results
Reasoning: Excellent (advanced agentic)
Function Calling: Highly reliable
Code Generation: Strong quality (70B model)
Context Handling: Superior (65K tokens)
Instruction Following: Excellent compliance
Roleplaying: Natural and engaging
Quality Metrics
- Function Calling Accuracy: 95%+
- Instruction Following: 95%+
- Reasoning Quality: Excellent
- Code Correctness: High (90%+)
- Context Preservation: Excellent
- Agent Reliability: Production-grade
Nous Research Philosophy
- User-aligned models
- Powerful steering capabilities
- End-user control
- Advanced capabilities
- Instruction tuning excellence
- Agentic focus
- Production reliability
References
- LangMart Model Documentation: https://langmart.ai/model-docs
- Nous Research: https://www.nousresearch.com/
- NextBit Hosting: https://www.nextbit.ai/
- Llama 3.1 Base: https://huggingface.co/meta-llama/Llama-3.1-70B
Last Updated: December 24, 2025