Yi 34B Chat Model Documentation

Model Overview

Property            Value
Model Name          Yi 34B Chat
Inference Model ID  01-ai/yi-34b-chat
Creator             01.AI
Release Date        December 7, 2023
Last Updated        November 10, 2025
Status              Active and Available

Description

The Yi series models are large language models trained from scratch by developers at 01.AI. This 34B parameter model has been instruct-tuned specifically for chat applications, providing optimized performance for conversational tasks and instruction-following.

Technical Specifications

Architecture & Parameters

  • Total Parameters: 34 billion
  • Model Type: Instruct-tuned Chat Model
  • Training Approach: Trained from scratch by 01.AI
  • Model Format: ChatML

Context & Input/Output

Property           Details
Context Window     4,096 tokens
Input Modalities   Text
Output Modalities  Text
Maximum Tokens     4,096 (estimated)

Stop Sequences

The model uses the following default stop sequences:

  • <|im_start|>
  • <|im_end|>
  • <|endoftext|>
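
These are the special tokens of the ChatML prompt template that Yi chat models use. For reference, a raw ChatML prompt looks like the following; the chat completions API builds this from the messages array automatically, so you only need the raw form when calling completion endpoints directly or running the model locally:

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello! How are you today?<|im_end|>
<|im_start|>assistant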

Pricing

Pricing details are available through the LangMart pricing API. Check the LangMart platform directly for current token rates; input and output pricing may vary by provider.

Use Cases

Ideal For

  • General chat and conversational AI applications
  • Customer service chatbots
  • Q&A systems with conversational context
  • Interactive tutoring and educational applications
  • Content generation tasks
  • Multi-turn dialogue systems

Not Recommended For

  • Complex reasoning tasks requiring extended logic chains
  • Tasks needing very long context windows (>4K tokens)
  • Vision/multimodal tasks
  • Real-time applications requiring very low latency

Integration with LangMart

To use this model in LangMart:

  1. Add 01-ai/yi-34b-chat to your provider connections
  2. Configure LangMart API key
  3. Set context window to 4,096 tokens
  4. Use ChatML format for message formatting
  5. Handle stop sequences in response parsing (see the sketch below)
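
A minimal sketch for step 5, assuming a provider occasionally leaks a stop token at the end of a completion (most providers strip these server-side, so this is a defensive fallback):

# Minimal sketch: strip leaked ChatML stop tokens from a completion.
STOP_SEQUENCES = ("<|im_start|>", "<|im_end|>", "<|endoftext|>")

def clean_completion(text: str) -> str:
    # Truncate at the earliest surviving stop sequence, if any.
    for stop in STOP_SEQUENCES:
        index = text.find(stop)
        if index != -1:
            text = text[:index]
    return text.strip()

print(clean_completion("Hello there!<|im_end|>"))  # -> Hello there!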

Capabilities & Features

Supported Capabilities

  • Chat Completion: Full support for multi-turn conversations
  • Instruction Following: Optimized for chat-based instruction execution
  • Text Generation: General purpose text generation
  • Conversation Management: Designed for extended dialogue contexts

Limitations

  • No Reasoning Features: Not designed for extended, multi-step reasoning chains
  • Context Limitation: The 4,096-token context window constrains long conversations and documents
  • Text-Only: Does not support image, audio, or other modalities

Model Weights & Implementation

  • Official Weights: Available on Hugging Face
  • Repository: 01-ai/Yi-34B-Chat
  • Publicly Available: Yes; weights are openly downloadable (see the repository for current license terms)

Integration & Usage

LangMart API Usage

# Chat Completions
curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "01-ai/yi-34b-chat",
    "messages": [
      {
        "role": "user",
        "content": "Hello! How are you today?"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 2048
  }'

Python Example

import os
import requests

api_key = os.environ["LANGMART_API_KEY"]  # set this in your environment

response = requests.post(
    "https://api.langmart.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "01-ai/yi-34b-chat",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Explain the concept of machine learning."
            }
        ],
        "temperature": 0.7,
        "max_tokens": 2048,
        "top_p": 0.9
    }
)

response.raise_for_status()  # surface HTTP errors such as rate limits
print(response.json())

JavaScript/Node.js Example

const apiKey = process.env.LANGMART_API_KEY; // set this in your environment

const response = await fetch("https://api.langmart.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "01-ai/yi-34b-chat",
    messages: [
      {
        role: "user",
        content: "What are the benefits of using LLMs?"
      }
    ],
    temperature: 0.7,
    max_tokens: 2048
  })
});

if (!response.ok) throw new Error(`Request failed: ${response.status}`);
const data = await response.json();
console.log(data);

Recommended Parameters

Parameter          Recommended Value  Range
temperature        0.7                0.0 to 2.0
top_p              0.9                0.0 to 1.0
max_tokens         1024-2048          1 to 4096
frequency_penalty  0.0                -2.0 to 2.0
presence_penalty   0.0                -2.0 to 2.0

Comparison with Similar Models

Model        Parameters  Context  Type
Yi 34B Chat  34B         4,096    Chat-Optimized
Llama 2 34B  34B         4,096    General
Mistral 7B   7B          8,192    General

Model Availability

  • LangMart Status: ✓ Available
  • Direct Access: Yes (Hugging Face)
  • Deprecation Status: None indicated
  • Marketplace Visibility: Public and fully visible

Inference Considerations

Performance Expectations

  • Typical Latency: ~1-3 seconds for standard queries (depends on provider)
  • Throughput: Suitable for standard production workloads
  • Memory Requirements: Approximately 64-80 GB VRAM for full model inference
  • Quantization: Available in various bit formats (4-bit, 8-bit, etc.); see the local inference sketch below
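
As an illustration of running the open weights locally, here is a minimal sketch using Hugging Face transformers with 4-bit quantization. The package set (transformers, accelerate, bitsandbytes) and memory assumptions describe a typical single-GPU setup and are not LangMart-specific guidance:

# Minimal sketch: local 4-bit inference with Hugging Face transformers.
# Assumes transformers, accelerate, and bitsandbytes are installed and a
# GPU with roughly 24+ GB of VRAM is available; adjust for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "01-ai/Yi-34B-Chat"
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # spread layers across available GPUs
)

# The tokenizer's chat template applies the ChatML format shown earlier.
messages = [{"role": "user", "content": "Hello! How are you today?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))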

Best Practices

  1. Batching: For high-volume scenarios, consider batch processing
  2. Temperature Tuning: Adjust temperature based on desired creativity/determinism
  3. Token Budget: Plan for prompt tokens + max_tokens to stay within the 4,096-token window (e.g., a 3,000-token prompt leaves at most 1,096 tokens for the reply)
  4. Stop Sequences: Ensure proper handling of ChatML stop tokens
  5. Error Handling: Implement retry logic for API rate limiting (see the sketch below)
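
A minimal retry sketch for item 5, assuming the API signals rate limiting with HTTP 429; the exact status code and backoff policy may vary by provider, so check the LangMart API docs:

# Minimal sketch: retry with exponential backoff on rate limiting.
# The 429 status code and backoff schedule are assumptions.
import time
import requests

def post_with_retry(url: str, headers: dict, payload: dict, max_retries: int = 5):
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code != 429:
            response.raise_for_status()
            return response.json()
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, 8s, 16s
    raise RuntimeError("Rate limited after repeated retries")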

Support & Documentation

For more information, consult the following resources:

  • Model Card: Available on Hugging Face
  • Training Documentation: 01.AI official documentation
  • Community Discussion: Hugging Face model page
  • API Documentation: LangMart API docs

Documentation generated from the LangMart AI Model Registry. Last updated: 2025-12-23