
Meta Llama 2 13B Chat

Model Overview

  • Model ID: meta-llama/llama-2-13b-chat
  • Creator: Meta (Llama Team)
  • Release Date: July 18, 2023
  • Model Type: Chat-Optimized Language Model

Description

Meta Llama 2 13B Chat is a 13 billion parameter language model fine-tuned specifically for chat completions and conversational tasks. It is Meta's openly available model, released under the Llama 2 Community License, designed for dialogue-based applications and instruction following.

Technical Specifications

Model Architecture

  • Parameter Count: 13 billion parameters
  • Model Family: Llama 2
  • Prompt Format: Llama 2 instruction template ([INST]-style)
  • Fine-tuning: Chat-optimized through instruction fine-tuning

Input/Output Configuration

  • Context Window: 4,096 tokens
  • Input Modalities: Text
  • Output Modalities: Text
  • Default Stop Sequences: </s>, [INST] (from the Llama 2 prompt template; see the sketch below)
  • Max Output Tokens: not specified on LangMart (listed as N/A)
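The default stop sequences reflect the Llama 2 chat prompt template, in which each user turn is wrapped in [INST] ... [/INST] tags and an optional system prompt sits inside <<SYS>> markers. Chat-completions endpoints such as LangMart's normally apply this template server-side, but a minimal Python sketch of it (the helper name is illustrative, not part of any SDK) can help when debugging raw prompts:

# Canonical Llama 2 chat template (illustrative helper).
def build_llama2_prompt(system_prompt: str, user_message: str) -> str:
    # The system prompt goes inside <<SYS>> markers; the user turn is
    # wrapped in [INST] ... [/INST]; the model's reply ends with </s>.
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

print(build_llama2_prompt("You are a helpful assistant.", "Hello!"))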

Model Variants

  • Full Model: meta-llama/Llama-2-13b-chat-hf (Hugging Face)
  • LangMart Endpoint: meta-llama/llama-2-13b-chat

Pricing

Note: Pricing varies by provider and API platform. On LangMart, check the model pricing page for current rates. As a guide:

  • Input tokens: typically $0.10 per 1M tokens (subject to variation)
  • Output tokens: typically $0.10 per 1M tokens (subject to variation)
  • Consult LangMart pricing directly for exact rates; a back-of-envelope estimate is sketched below
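Given per-token rates, estimating a workload's cost is simple arithmetic. A minimal Python sketch, assuming the indicative $0.10 per 1M-token rates above (verify against LangMart's live pricing):

# Rough cost estimator at the indicative rates above; not an official calculator.
INPUT_RATE_PER_M = 0.10   # USD per 1M input tokens (assumed)
OUTPUT_RATE_PER_M = 0.10  # USD per 1M output tokens (assumed)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# Example: 40k prompt tokens + 10k completion tokens ≈ $0.005
print(f"${estimate_cost(40_000, 10_000):.4f}")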

Performance Characteristics

Capabilities

  • Chat-optimized for conversational tasks
  • High-quality text generation and completion
  • Multi-turn conversation support
  • Instruction following and chat-based reasoning
  • Suitable base for fine-tuning on domain-specific tasks

Context and Limitations

  • 4,096 token context window (suitable for most conversations; see the history-trimming sketch below)
  • Optimized for chat interactions rather than general text processing
  • Open-source model with community support
  • No specific performance benchmarks provided on LangMart
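Because the window is fixed at 4,096 tokens, long conversations eventually need their history trimmed before each request. A minimal sketch using the rough ~4 characters/token heuristic (the helper name is illustrative; swap in a real tokenizer for accuracy):

# Keep only the most recent turns that fit a token budget (hypothetical helper).
def trim_history(messages: list[dict], budget_tokens: int = 3000) -> list[dict]:
    kept, used = [], 0
    for msg in reversed(messages):           # walk from newest to oldest
        cost = len(msg["content"]) // 4 + 8  # ~4 chars/token + per-message overhead
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))              # restore chronological order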

API Integration

LangMart API Usage

# Request Format
curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-2-13b-chat",
    "messages": [
      {
        "role": "user",
        "content": "Hello, how can you help me?"
      }
    ],
    "max_tokens": 2048,
    "temperature": 0.7
  }'

API Parameters

Parameter          Type     Description                  Default
model              string   Model identifier             Required: meta-llama/llama-2-13b-chat
messages           array    Conversation history         Required
max_tokens         integer  Maximum response length      2048
temperature        float    Response randomness (0-2)    0.7
top_p              float    Nucleus sampling parameter   1.0
frequency_penalty  float    Reduce repetition            0.0
presence_penalty   float    Encourage new tokens         0.0
stop               array    Stop sequences               ["</s>", "[INST]"]

The 4,096-token context window is a model property, not a request parameter; input and output together must fit within it.
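These parameters map one-to-one onto the JSON request body. A minimal Python sketch using the requests library against the LangMart endpoint shown above (error handling kept deliberately thin):

import requests

payload = {
    "model": "meta-llama/llama-2-13b-chat",
    "messages": [{"role": "user", "content": "Summarize Llama 2 in one sentence."}],
    "max_tokens": 512,
    "temperature": 0.7,
    "top_p": 1.0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "stop": ["</s>", "[INST]"],
}

resp = requests.post(
    "https://api.langmart.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])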

Usage Examples

Basic Chat Completion

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-2-13b-chat",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Explain quantum computing in simple terms."
      }
    ]
  }'

Multi-Turn Conversation

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-2-13b-chat",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      },
      {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      {
        "role": "user",
        "content": "What is its population?"
      }
    ]
  }'
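The same pattern generalizes to a loop: append each assistant reply to the messages array and resend the full history every turn. A minimal Python sketch, assuming the OpenAI SDK pointed at LangMart's OpenAI-compatible endpoint (see Integration Notes below):

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.langmart.ai/v1")
history = []

for user_input in ["What is the capital of France?", "What is its population?"]:
    history.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="meta-llama/llama-2-13b-chat",
        messages=history,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})  # keep context for the next turn
    print(reply)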

Code Generation

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-2-13b-chat",
    "messages": [
      {
        "role": "user",
        "content": "Write a Python function to calculate factorial."
      }
    ],
    "temperature": 0.5
  }'

Creative Writing

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-2-13b-chat",
    "messages": [
      {
        "role": "user",
        "content": "Write a short science fiction story about space exploration."
      }
    ],
    "temperature": 0.9,
    "max_tokens": 1024
  }'

Model Availability

Platform         Status     Notes
Hugging Face     Available  Model: meta-llama/Llama-2-13b-chat-hf
LangMart         Available  Accessible via API endpoint
Replicate        Available  Inference platform option
Together AI      Available  API endpoint available
Other Providers  Variable   Check provider status

Training and Fine-tuning

This model is suitable for fine-tuning on:

  • Domain-specific chat applications
  • Customer service automation
  • Q&A systems
  • Dialogue-based applications
  • Instruction-following tasks

Fine-tuning on custom datasets requires acceptance of the Llama 2 Community License and sufficient compute resources; a minimal sketch follows.
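As one illustration, parameter-efficient fine-tuning with LoRA keeps memory requirements manageable for a 13B model. A minimal sketch using Hugging Face transformers and peft (assumes the Llama 2 license has been accepted on Hugging Face and GPU capacity is available; dataset preparation and the training loop are omitted):

# LoRA fine-tuning sketch; hyperparameters are illustrative, not recommendations.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-2-13b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections only
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a small fraction of the 13B weights train
# ...continue with a standard Trainer/SFTTrainer loop over your chat dataset...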

Safety and Ethical Considerations

  • Built with safety techniques from Meta's responsible AI research
  • Suitable for production deployment with appropriate monitoring
  • Supported by a large open-source community (the model itself is maintained by Meta)
  • Designed to reduce harmful outputs through instruction fine-tuning

Comparison with Other Models

Similar Models

  • Llama 2 7B Chat: Smaller, faster variant (7B parameters)
  • Llama 2 70B Chat: Larger, more capable variant (70B parameters)
  • Mistral 7B Instruct: Similar size, alternative architecture
  • Neural Chat 7B: Instruction-tuned alternative

Size vs Capability Tradeoff

  • 7B: Faster inference, lower memory, acceptable quality
  • 13B: Balance of speed and quality (recommended for most use cases)
  • 70B: Best quality, slower inference, higher resource requirements

Integration Notes

LangMart/LangChain Integration

from langchain_openai import ChatOpenAI  # LangMart exposes an OpenAI-compatible API

llm = ChatOpenAI(
    model="meta-llama/llama-2-13b-chat",
    api_key="YOUR_API_KEY",
    base_url="https://api.langmart.ai/v1",
    temperature=0.7,
    max_tokens=2048,
)

response = llm.invoke("Hello, how can you help me?")
print(response.content)

OpenAI-Compatible Endpoint

The model is compatible with OpenAI-style API calls:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LANGMART_API_KEY",
    base_url="https://api.langmart.ai/v1",
)

response = client.chat.completions.create(
    model="meta-llama/llama-2-13b-chat",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)

Last Updated

November 10, 2025

This documentation was generated from LangMart model data. For the most current information, visit the LangMart model documentation.