Claude 3 Haiku

Anthropic · Vision · Tools · Streaming
200K context · $0.25 / 1M input tokens · $1.25 / 1M output tokens · 4K max output

Model Overview

| Property | Value |
|---|---|
| Provider | Anthropic |
| Model Name | Claude 3 Haiku |
| Model ID (for inference) | anthropic/claude-3-haiku |
| Created | March 13, 2024 |
| Context Length | 200,000 tokens |
| Max Output Tokens | 4,096 tokens |

Description

Claude 3 Haiku is Anthropic's fastest and most compact model, designed for near-instant responsiveness and quick, accurate performance on targeted tasks. It excels where rapid responses are required while maintaining high-quality output.

Key characteristics include:

  • Near-instant response times for real-time applications
  • Compact model size optimized for efficiency
  • Strong performance on targeted tasks
  • Multimodal support (text and images)
  • Cost-effective pricing for high-volume applications

The model is ideal for use cases where speed is critical, such as chatbots, real-time assistants, content moderation, and high-throughput processing tasks.

Technical Specifications

| Specification | Value |
|---|---|
| Context Window | 200,000 tokens |
| Max Completion Tokens | 4,096 tokens |
| Data Retention | 30 days |
| Moderation | Required for API usage |
| Deprecation Date | Not announced |

Pricing

Standard Pricing

| Type | Rate |
|---|---|
| Input | $0.25 / 1M tokens |
| Output | $1.25 / 1M tokens |
| Image Input | $0.40 / 1K images |
| Input Cache Read | $0.03 / 1M tokens |
| Input Cache Write | $0.30 / 1M tokens |

Price per Token (Detailed)

| Type | Price per Token |
|---|---|
| Input | $0.00000025 |
| Output | $0.00000125 |
| Cache Read | $0.00000003 |
| Cache Write | $0.00000030 |
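These per-token rates make request costs easy to estimate. A minimal sketch in Python, with the rates hardcoded from the tables above (the helper function is illustrative, not part of any SDK):

```python
# Claude 3 Haiku rates in USD per million tokens (from the pricing tables above)
INPUT_PER_M = 0.25
OUTPUT_PER_M = 1.25

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD from token counts."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# A request with 10,000 input tokens and 2,000 output tokens:
cost = estimate_cost(10_000, 2_000)
print(f"${cost:.4f}")  # $0.0050
```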

Capabilities

| Capability | Supported |
|---|---|
| Reasoning Mode | No |
| Tool/Function Calling | Yes |
| Vision (Image Analysis) | Yes |
| File Processing | No |
| Streaming | Yes |
| Caching | Yes |
| Multi-Part Input | Yes |

Supported Parameters

| Parameter | Description |
|---|---|
| max_tokens | Maximum number of tokens to generate (up to 4,096) |
| temperature | Controls randomness (0-1) |
| top_p | Nucleus sampling threshold |
| top_k | Top-k sampling parameter |
| stop | Stop sequences to end generation |
| tools | List of available tools/functions |
| tool_choice | Control tool selection behavior |
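The parameters above slot into a standard chat-completions request body. A sketch of building such a payload in Python (all values are illustrative):

```python
import json

# Request body for anthropic/claude-3-haiku using the parameters listed above
payload = {
    "model": "anthropic/claude-3-haiku",
    "messages": [{"role": "user", "content": "Summarize this in one line."}],
    "max_tokens": 512,    # must not exceed the model's 4,096 cap
    "temperature": 0.3,   # 0-1; lower = more deterministic
    "top_p": 0.9,         # nucleus sampling threshold
    "top_k": 40,          # top-k sampling
    "stop": ["\n\n"],     # stop sequences that end generation
}
body = json.dumps(payload)
```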

Best Practices

  1. For High-Volume Applications: Leverage the low cost per token for batch processing tasks
  2. For Real-Time Chat: Take advantage of near-instant response times for conversational AI
  3. For Cost Optimization: Use Haiku for simpler tasks, reserving larger models for complex reasoning
  4. For Image Analysis: Utilize multimodal capability for quick image understanding tasks
  5. For Content Moderation: Ideal for high-throughput content screening
  6. For Caching: Use cache features for repeated context to further reduce costs
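Point 6 is worth quantifying: with the cache rates from the pricing section, re-reading a long shared prefix from cache is far cheaper than resending it as regular input. A back-of-the-envelope sketch (the request counts are illustrative, and real cache behavior depends on provider-specific TTL rules):

```python
# Rates in USD per million tokens (from the pricing section above)
INPUT_PER_M = 0.25
CACHE_READ_PER_M = 0.03
CACHE_WRITE_PER_M = 0.30

prefix_tokens = 50_000  # shared system prompt / context reused across requests
requests = 100

# Without caching: the prefix is billed as regular input on every request
no_cache = requests * prefix_tokens * INPUT_PER_M / 1_000_000

# With caching: one cache write, then cheap cache reads
with_cache = (prefix_tokens * CACHE_WRITE_PER_M
              + (requests - 1) * prefix_tokens * CACHE_READ_PER_M) / 1_000_000

print(f"no cache: ${no_cache:.2f}, cached: ${with_cache:.4f}")
# no cache: $1.25, cached: $0.1635
```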

API Usage Example

LangMart Format

curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3-haiku",
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ]
  }'

With max_tokens

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3-haiku",
    "messages": [
      {"role": "user", "content": "Hello, Claude!"}
    ],
    "max_tokens": 1024
  }'

With Image Input

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3-haiku",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/jpeg;base64,..."
            }
          }
        ]
      }
    ],
    "max_tokens": 1024
  }'
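The `data:` URL in the image example is just a base64-encoded image with a media-type prefix. A sketch of building one in Python (the helper is illustrative):

```python
import base64

def to_data_url(image_bytes: bytes, media_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data: URL for the image_url content part."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{media_type};base64,{b64}"

# JPEG files start with the magic bytes FF D8 FF
url = to_data_url(b"\xff\xd8\xff")
print(url)  # data:image/jpeg;base64,/9j/
```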

With Tool Calling

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3-haiku",
    "messages": [
      {"role": "user", "content": "What is the weather in Tokyo?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'
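When the model decides to call `get_weather`, the response contains a `tool_calls` entry instead of text; the caller executes the function locally and sends the result back as a `tool` message. A sketch of that dispatch step, using a simulated assistant message in the OpenAI-compatible shape (the exact field layout may vary by gateway, and `get_weather` here is a stub):

```python
import json

def get_weather(location: str) -> dict:
    # Stand-in for a real weather lookup
    return {"location": location, "condition": "sunny", "temp_c": 22}

TOOLS = {"get_weather": get_weather}

# Simulated assistant message containing a tool call (illustrative shape)
message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather",
                     "arguments": '{"location": "Tokyo"}'},
    }],
}

# Execute each requested tool and build the tool-result messages to send back
tool_messages = []
for call in message.get("tool_calls", []):
    fn = TOOLS[call["function"]["name"]]
    args = json.loads(call["function"]["arguments"])
    result = fn(**args)
    tool_messages.append({
        "role": "tool",
        "tool_call_id": call["id"],
        "content": json.dumps(result),
    })
```

The resulting `tool_messages` are appended to the conversation and sent in a follow-up request so the model can compose its final answer.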

Claude 3 Family

| Model | Context | Use Case |
|---|---|---|
| Claude 3 Opus | 200K tokens | Highest capability, complex reasoning |
| Claude 3.5 Sonnet | 200K tokens | Balanced performance and efficiency |
| Claude 3 Haiku | 200K tokens | Speed-optimized, cost-effective |

Newer Generations

| Model | Context | Notes |
|---|---|---|
| Claude 3.7 Sonnet | 200K tokens | Enhanced reasoning |
| Claude Sonnet 4 | 1M tokens | Latest Sonnet generation |
| Claude Opus 4 | 200K tokens | Latest flagship model |

Providers

Available Providers

| Provider | Status |
|---|---|
| Anthropic | Primary |
| Amazon Bedrock | Available |
| Google Vertex AI | Available |

Supported Modalities

Input Modalities

  • Text
  • Images

Output Modalities

  • Text only

Performance Characteristics

Claude 3 Haiku is optimized for:

  • Speed: Near-instant responsiveness for real-time applications
  • Efficiency: Compact model architecture for lower latency
  • Accuracy: Quick and accurate targeted performance
  • Throughput: High volume processing capability

Use Case Recommendations

| Use Case | Suitability |
|---|---|
| Real-time chatbots | Excellent |
| Content moderation | Excellent |
| Quick Q&A | Excellent |
| High-volume processing | Excellent |
| Image captioning | Good |
| Simple tool calling | Good |
| Complex reasoning | Consider larger models |
| Long-form generation | Consider larger models |
