OpenAI o1-preview

OpenAI · 128K Context · $15.00 Input /1M · $60.00 Output /1M · 32,768 Max Output

Model Overview

Property Value
Model ID openai/o1-preview
Full Name OpenAI: o1-preview
Provider OpenAI
Release Date September 12, 2024
Type Reasoning Language Model (LLM)
Architecture Transformer-based with Chain-of-Thought Reasoning

Description

OpenAI o1-preview is a reasoning-focused model designed to "spend more time thinking before responding." It employs chain-of-thought reasoning with self-fact-checking capabilities, making it particularly powerful for complex problem-solving tasks.

Key characteristics:

  • Extended Reasoning: Uses internal reasoning tokens to think through problems step-by-step
  • PhD-Level Performance: Surpassed human PhD-level performance on the GPQA diamond benchmark
  • Math Olympiad: Scored 83% on a qualifying exam for the International Mathematics Olympiad (AIME)
  • Competitive Programming: 89th percentile performance on Codeforces
  • STEM Optimization: Specifically designed for math, science, programming, and other STEM-related tasks

Note: This model is currently experimental and not recommended for production use cases. It may be subject to heavy rate limiting.

Technical Specifications

Context & Token Limits

Parameter Value
Context Length 128,000 tokens
Max Completion Tokens 32,768 tokens
Training Data Cutoff October 1, 2023
Version o1-preview-2024-09-12
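
With a 128K context window, a 32,768-token completion cap, and OpenAI's advice to reserve at least 25,000 tokens for reasoning and output, the room left for a prompt can be budgeted up front. A minimal sketch (the constants come from the table above; a real implementation would count prompt tokens with a tokenizer such as tiktoken):

```python
CONTEXT_LENGTH = 128_000   # o1-preview context window
MAX_COMPLETION = 32_768    # hard cap on completion tokens
RESERVED = 25_000          # suggested reserve for reasoning + output

def max_prompt_tokens(reserve: int = RESERVED) -> int:
    """Upper bound on prompt tokens once the reasoning/output reserve is set aside."""
    # The reserve cannot usefully exceed the completion cap; the prompt
    # gets whatever remains of the context window.
    reserve = min(reserve, MAX_COMPLETION)
    return CONTEXT_LENGTH - reserve

print(max_prompt_tokens())  # 103000
```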

Input/Output Modalities

Modality Support
Text Input Yes
Image Input No (beta limitation)
File Input No
Text Output Yes
Image Output No
Audio Output No

Reasoning Tokens

The o1-preview model uses hidden "reasoning tokens" internally to work through complex problems. These tokens:

  • Are consumed from your context window
  • Are billed at the same rate as output tokens
  • Are not visible in the API response

OpenAI recommends reserving at least 25,000 tokens for reasoning and output when budgeting a request.
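
Because reasoning tokens are billed but never returned, the visible text is only part of what you pay for. A sketch of splitting a response's completion count into visible and hidden portions, assuming the `completion_tokens_details.reasoning_tokens` field reported in the API's usage block (the numbers are illustrative):

```python
# Illustrative dict mirroring the shape of an API response's "usage" field
usage = {
    "prompt_tokens": 1_200,
    "completion_tokens": 6_500,  # includes hidden reasoning tokens
    "completion_tokens_details": {"reasoning_tokens": 5_000},
}

reasoning = usage["completion_tokens_details"]["reasoning_tokens"]
visible = usage["completion_tokens"] - reasoning  # tokens you actually see
print(visible, reasoning)  # 1500 5000
```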

Pricing

Standard Pricing (Per Token)

Type Cost per Token Cost per Million Tokens
Input $0.000015 $15.00
Output $0.00006 $60.00
Reasoning Tokens $0.00006 $60.00 (same as output)
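
The per-token rates above turn into a quick cost estimate. A sketch, assuming hidden reasoning tokens are already included in the completion count (as the API reports them):

```python
INPUT_PER_M = 15.00    # USD per 1M input tokens
OUTPUT_PER_M = 60.00   # USD per 1M output tokens (reasoning billed at the same rate)

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost; completion_tokens should include hidden reasoning tokens."""
    return (prompt_tokens * INPUT_PER_M + completion_tokens * OUTPUT_PER_M) / 1_000_000

# 1,200 prompt tokens + 6,500 completion tokens (visible output + reasoning)
print(round(estimate_cost(1_200, 6_500), 4))  # 0.408
```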

Cost Comparison

Model Input (per 1M) Output (per 1M)
o1-preview $15.00 $60.00
GPT-4o $2.50 $10.00
GPT-4o Mini $0.15 $0.60

Note: o1-preview is significantly more expensive than standard models due to its advanced reasoning capabilities.

Capabilities

Core Capabilities

  • Advanced Reasoning: Complex logical reasoning through chain-of-thought
  • Mathematical Problem-Solving: Olympiad-level math capabilities
  • Scientific Analysis: PhD-level performance on physics, chemistry, biology
  • Code Generation: Expert-level code analysis and generation
  • Self-Verification: Built-in fact-checking during reasoning

Use Cases

  • Advanced scientific and mathematical problem-solving
  • Expert-level code analysis and debugging
  • Critical domain reasoning (biomedicine, law, finance)
  • Applications that benefit from step-by-step reasoning (note that the chain of thought itself is not returned)
  • High-stakes decision support systems
  • Research and complex analysis tasks

Limitations

  • No vision/image processing capability
  • No tool calling or function use
  • No fine-tuning available
  • No streaming support
  • Higher latency due to reasoning process
  • Significantly higher cost than standard models

Supported Parameters

Available Parameters (Beta)

Parameter Type Description
model string Model identifier (o1-preview)
messages array User and assistant messages only
max_tokens integer Maximum tokens to generate (up to 32,768)
max_completion_tokens integer Preferred replacement for max_tokens; caps visible output plus hidden reasoning tokens

Fixed Parameters (Cannot Be Modified)

Parameter Fixed Value Notes
temperature 1 Cannot be adjusted during beta
top_p 1 Cannot be adjusted during beta
n 1 Only single completions supported
presence_penalty 0 Cannot be adjusted during beta
frequency_penalty 0 Cannot be adjusted during beta

Not Supported (Beta Limitations)

Feature Status
System Messages Not supported
Streaming Not supported
Tool/Function Calling Not supported
Image Inputs Not supported
Logprobs Not supported
Stop Sequences Not supported
Web Search Not supported
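
Since several standard Chat Completions options are rejected during the beta, it can help to normalize a request payload before sending it. A minimal sketch; folding system messages into the first user message is one possible workaround, not an official recommendation:

```python
# Options rejected by o1-preview during the beta (per the table above)
UNSUPPORTED_KEYS = {"stream", "tools", "tool_choice", "logprobs", "stop"}

def prepare_o1_payload(payload: dict) -> dict:
    """Drop beta-unsupported options and fold system messages into the first user message."""
    clean = {k: v for k, v in payload.items() if k not in UNSUPPORTED_KEYS}
    system_parts, messages = [], []
    for msg in clean.get("messages", []):
        if msg["role"] == "system":
            system_parts.append(msg["content"])  # system role is not supported
        else:
            messages.append(dict(msg))
    if system_parts and messages and messages[0]["role"] == "user":
        messages[0]["content"] = "\n\n".join(system_parts + [messages[0]["content"]])
    clean["messages"] = messages
    return clean

payload = {
    "model": "o1-preview",
    "stream": True,
    "messages": [
        {"role": "system", "content": "You are a careful mathematician."},
        {"role": "user", "content": "Prove that the square root of 2 is irrational."},
    ],
}
cleaned = prepare_o1_payload(payload)
```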

Best Practices

When to Use o1-preview

  • Complex multi-step mathematical problems
  • Scientific reasoning requiring domain expertise
  • Advanced coding challenges (algorithms, system design)
  • Tasks requiring careful logical analysis
  • Problems where accuracy is more important than speed
  • Research and academic problem-solving

When to Consider Alternatives

  • Simple tasks (use GPT-4o Mini for cost savings)
  • Tasks requiring images/vision (use GPT-4o)
  • Tasks requiring tool/function calling (use GPT-4o)
  • Real-time applications requiring low latency (use GPT-4o Mini)
  • Production workloads (o1-preview is experimental)

Optimization Tips

  1. Reserve sufficient tokens: Leave at least 25,000 tokens for reasoning and output
  2. Be detailed in prompts: The model benefits from comprehensive problem descriptions
  3. Avoid system messages: They are not supported; include instructions in user message
  4. Expect higher latency: The model takes longer to respond due to reasoning
  5. Budget for costs: Output tokens (including hidden reasoning) are expensive
  6. Handle errors gracefully: The model may be rate-limited
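
Tips 4 and 6 combine in practice: long-running requests plus beta rate limits make retries with exponential backoff worthwhile. A generic sketch; the exception type to catch depends on your client library (`openai.RateLimitError` in the official SDK):

```python
import random
import time

def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    """Retry `call` on rate-limit errors with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:  # substitute your client's rate-limit exception
            if attempt == max_retries - 1:
                raise
            # Exponential backoff: base, 2x base, 4x base, ... plus jitter
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
```

Usage: wrap the API call in a zero-argument callable, e.g. `with_backoff(lambda: client.chat.completions.create(...))`.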

Prompt Engineering for o1

Since o1-preview cannot use system messages, structure your prompts differently:

# Instead of system message, include instructions in the user prompt
messages = [
    {
        "role": "user",
        "content": """You are an expert mathematician. Please solve the following
        problem step by step, showing all your work and explaining your reasoning.

        Problem: [Your problem here]

        Please provide a complete solution with explanation."""
    }
]

Model Comparison

o1-mini: Faster, cheaper, smaller context; good for coding tasks
o1: Full version with 200K context (if available)
GPT-4o: General purpose, multimodal, supports tools and vision
Claude 3.5 Sonnet: Anthropic competitor with a different reasoning approach
Gemini 2.0 Flash Thinking: Google's reasoning-model alternative

Provider Information

Primary Provider: OpenAI

Property Value
Base URL https://api.langmart.ai/v1
Data Training Disabled by default
Prompt Retention Yes (for abuse monitoring)
Status Beta/Experimental

OpenRouter Access

Property Value
OpenRouter Model ID openai/o1-preview
Base URL https://api.langmart.ai/v1

Usage Examples

Basic Chat Completion (OpenAI Direct)

curl https://api.langmart.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "o1-preview",
    "messages": [
      {
        "role": "user",
        "content": "Solve this problem step by step: If a train travels at 60 mph for 2 hours, then at 40 mph for 3 hours, what is the average speed for the entire journey?"
      }
    ],
    "max_completion_tokens": 5000
  }'

Via OpenRouter

curl https://api.langmart.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "HTTP-Referer: https://your-app.com" \
  -d '{
    "model": "openai/o1-preview",
    "messages": [
      {
        "role": "user",
        "content": "Prove that the square root of 2 is irrational."
      }
    ],
    "max_tokens": 10000
  }'

Python SDK Example

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": """
            A farmer has a 400-meter fence and wants to enclose a rectangular
            field next to a river (no fence needed on the river side).
            What dimensions maximize the enclosed area?
            """
        }
    ],
    max_completion_tokens=10000
)

print(response.choices[0].message.content)

Complex Coding Problem

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": """
            Implement a function that finds the longest increasing subsequence
            in an array of integers. Explain your approach and analyze the
            time and space complexity.
            """
        }
    ],
    max_completion_tokens=8000
)

print(response.choices[0].message.content)

Scientific Reasoning

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": """
            Explain the mechanism by which CRISPR-Cas9 achieves gene editing,
            including the role of guide RNA, PAM sequences, and the repair
            mechanisms that follow DNA cleavage. Include potential off-target
            effects and current strategies to minimize them.
            """
        }
    ],
    max_completion_tokens=15000
)

print(response.choices[0].message.content)

OpenRouter Python Example

from openai import OpenAI

client = OpenAI(
    base_url="https://api.langmart.ai/v1",
    api_key="your-openrouter-key"
)

response = client.chat.completions.create(
    model="openai/o1-preview",
    messages=[
        {
            "role": "user",
            "content": "Derive the Euler-Lagrange equation from first principles."
        }
    ],
    max_tokens=10000,
    extra_headers={
        "HTTP-Referer": "https://your-app.com"
    }
)

print(response.choices[0].message.content)

Model Variants

Variant Model ID Description
o1-preview o1-preview Standard reasoning model (128K context)
o1-preview-2024-09-12 o1-preview-2024-09-12 Specific version snapshot
o1 o1 Full o1 model (200K context, when available)
o1-mini o1-mini Smaller, faster, cheaper reasoning model

Rate Limits

o1-preview is subject to stricter rate limits during its beta phase:

Tier Approximate Limits
Standard Heavily rate-limited
Usage-based Increases with usage

Note: Specific limits may vary. Check OpenAI's documentation for current limits.

Error Handling

Common errors with o1-preview:

Error Cause Solution
system_message_not_supported Using system role Remove system messages
streaming_not_supported Stream parameter set Set stream=false
rate_limit_exceeded Too many requests Implement backoff
context_length_exceeded Input too long Reduce input size
model_not_available Model unavailable Try again later

Last Updated: December 2024
Source: LangMart API, OpenAI Documentation, and third-party benchmarks