OpenAI o1-preview
Model Overview
| Property | Value |
|----------|-------|
| Model ID | openai/o1-preview |
| Full Name | OpenAI: o1-preview |
| Provider | OpenAI |
| Release Date | September 12, 2024 |
| Type | Reasoning Language Model (LLM) |
| Architecture | Transformer-based with Chain-of-Thought Reasoning |
Description
OpenAI o1-preview is a reasoning-focused model designed to "spend more time thinking before responding." It employs chain-of-thought reasoning with self-fact-checking capabilities, making it particularly powerful for complex problem-solving tasks.
Key characteristics:
- Extended Reasoning: Uses internal reasoning tokens to think through problems step-by-step
- PhD-Level Performance: Surpassed human PhD-level accuracy on the GPQA Diamond benchmark
- Math Olympiad: Scored 83% on a qualifying exam for the International Mathematics Olympiad (AIME)
- Competitive Programming: Reached the 89th percentile on Codeforces
- STEM Optimization: Specifically designed for math, science, programming, and other STEM-related tasks
Note: This model is currently experimental and not recommended for production use-cases. It may be subject to heavy rate limiting.
Technical Specifications
Context & Token Limits
| Parameter | Value |
|-----------|-------|
| Context Length | 128,000 tokens |
| Max Completion Tokens | 32,768 tokens |
| Training Data Cutoff | October 1, 2023 |
| Version | o1-preview-2024-09-12 |
| Modality | Support |
|----------|---------|
| Text Input | Yes |
| Image Input | No (beta limitation) |
| File Input | No |
| Text Output | Yes |
| Image Output | No |
| Audio Output | No |
Reasoning Tokens
The o1-preview model uses hidden "reasoning tokens" internally to work through complex problems. These tokens:
- Are consumed from your context window
- Are billed at the same rate as output tokens
- Are not visible in the API response
Because of this overhead, OpenAI recommends reserving at least 25,000 tokens for reasoning and output.
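Since reasoning tokens come out of the same 128K context window, the usable prompt budget shrinks accordingly. A minimal sketch of the arithmetic, using the limits quoted in this document (the helper name is ours):

```python
CONTEXT_WINDOW = 128_000  # o1-preview context length
RESERVED = 25_000         # recommended reserve for reasoning + output

def max_prompt_tokens(context_window=CONTEXT_WINDOW, reserved=RESERVED):
    """Tokens left for the input prompt after reserving reasoning/output space."""
    return context_window - reserved

print(max_prompt_tokens())  # → 103000
```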
Pricing
Standard Pricing (Per Token)
| Type | Cost per Token | Cost per Million Tokens |
|------|----------------|-------------------------|
| Input | $0.000015 | $15.00 |
| Output | $0.00006 | $60.00 |
| Reasoning Tokens | $0.00006 | $60.00 (same as output) |
Cost Comparison
| Model | Input (per 1M) | Output (per 1M) |
|-------|----------------|-----------------|
| o1-preview | $15.00 | $60.00 |
| GPT-4o | $2.50 | $10.00 |
| GPT-4o Mini | $0.15 | $0.60 |
Note: o1-preview is significantly more expensive than standard models due to its advanced reasoning capabilities.
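To see what these rates mean in practice, here is a small cost estimator. The per-token prices come from the pricing table above; the helper name is illustrative, not part of any SDK:

```python
# Per-token prices from the pricing table above.
PRICE_PER_TOKEN = {"input": 0.000015, "output": 0.00006}

def estimate_cost_usd(input_tokens, output_tokens, reasoning_tokens):
    """Estimate request cost; hidden reasoning tokens bill at the output rate."""
    billed_output = output_tokens + reasoning_tokens
    return (input_tokens * PRICE_PER_TOKEN["input"]
            + billed_output * PRICE_PER_TOKEN["output"])

# A 1,000-token prompt with 500 visible output tokens and 2,000 hidden
# reasoning tokens costs about $0.165 — the reasoning tokens dominate.
print(round(estimate_cost_usd(1000, 500, 2000), 3))  # → 0.165
```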
Capabilities
Core Capabilities
- Advanced Reasoning: Complex logical reasoning through chain-of-thought
- Mathematical Problem-Solving: Olympiad-level math capabilities
- Scientific Analysis: PhD-level performance on physics, chemistry, biology
- Code Generation: Expert-level code analysis and generation
- Self-Verification: Built-in fact-checking during reasoning
Use Cases
- Advanced scientific and mathematical problem-solving
- Expert-level code analysis and debugging
- Critical domain reasoning (biomedicine, law, finance)
- Transparent chain-of-thought applications
- High-stakes decision support systems
- Research and complex analysis tasks
Limitations
- No vision/image processing capability
- No tool calling or function use
- No fine-tuning available
- No streaming support
- Higher latency due to reasoning process
- Significantly higher cost than standard models
Supported Parameters
Available Parameters (Beta)
| Parameter | Type | Description |
|-----------|------|-------------|
| model | string | Model identifier (o1-preview) |
| messages | array | User and assistant messages only |
| max_tokens | integer | Maximum tokens to generate (up to 32,768) |
| max_completion_tokens | integer | Alternative to max_tokens |
Fixed Parameters (Cannot Be Modified)
| Parameter | Fixed Value | Notes |
|-----------|-------------|-------|
| temperature | 1 | Cannot be adjusted during beta |
| top_p | 1 | Cannot be adjusted during beta |
| n | 1 | Only single completions supported |
| presence_penalty | 0 | Cannot be adjusted during beta |
| frequency_penalty | 0 | Cannot be adjusted during beta |
Not Supported (Beta Limitations)
| Feature | Status |
|---------|--------|
| System Messages | Not supported |
| Streaming | Not supported |
| Tool/Function Calling | Not supported |
| Image Inputs | Not supported |
| Logprobs | Not supported |
| Stop Sequences | Not supported |
| Web Search | Not supported |
Best Practices
When to Use o1-preview
- Complex multi-step mathematical problems
- Scientific reasoning requiring domain expertise
- Advanced coding challenges (algorithms, system design)
- Tasks requiring careful logical analysis
- Problems where accuracy is more important than speed
- Research and academic problem-solving
When to Consider Alternatives
- Simple tasks (use GPT-4o Mini for cost savings)
- Tasks requiring images/vision (use GPT-4o)
- Tasks requiring tool/function calling (use GPT-4o)
- Real-time applications requiring low latency (use GPT-4o Mini)
- Production workloads (o1-preview is experimental)
Optimization Tips
- Reserve sufficient tokens: Leave at least 25,000 tokens for reasoning and output
- Be detailed in prompts: The model benefits from comprehensive problem descriptions
- Avoid system messages: They are not supported; include instructions in user message
- Expect higher latency: The model takes longer to respond due to reasoning
- Budget for costs: Output tokens (including hidden reasoning) are expensive
- Handle errors gracefully: The model may be rate-limited
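The last tip can be turned into a simple retry wrapper. This is a generic exponential-backoff sketch, not an official OpenAI recipe; the function and parameter names are ours:

```python
import time

def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn, retrying with exponential backoff on any exception
    (e.g. a rate-limit error raised by the API client)."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, 8s, ...
```

In a real client you would catch only the rate-limit exception class rather than bare `Exception`, so that genuine request errors fail fast.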
Prompt Engineering for o1
Since o1-preview cannot use system messages, structure your prompts differently:
# Instead of a system message, include instructions in the user prompt
messages = [
    {
        "role": "user",
        "content": """You are an expert mathematician. Please solve the following
problem step by step, showing all your work and explaining your reasoning.

Problem: [Your problem here]

Please provide a complete solution with explanation."""
    }
]
Model Comparison

| Model | Comparison |
|-------|------------|
| o1-mini | Faster, cheaper, smaller context; good for coding tasks |
| o1 | Full version with 200K context (if available) |
| GPT-4o | General purpose, multimodal, supports tools and vision |
| Claude 3.5 Sonnet | Anthropic competitor with a different reasoning approach |
| Gemini 2.0 Flash Thinking | Google's reasoning model alternative |
Primary Provider: OpenAI
| Property | Value |
|----------|-------|
| Base URL | https://api.langmart.ai/v1 |
| Data Training | Disabled by default |
| Prompt Retention | Yes (for abuse monitoring) |
| Status | Beta/Experimental |
OpenRouter Access
| Property | Value |
|----------|-------|
| OpenRouter Model ID | openai/o1-preview |
| Base URL | https://api.langmart.ai/v1 |
Usage Examples
Basic Chat Completion (OpenAI Direct)
curl https://api.langmart.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "o1-preview",
    "messages": [
      {
        "role": "user",
        "content": "Solve this problem step by step: If a train travels at 60 mph for 2 hours, then at 40 mph for 3 hours, what is the average speed for the entire journey?"
      }
    ],
    "max_completion_tokens": 5000
  }'
Via OpenRouter
curl https://api.langmart.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "HTTP-Referer: https://your-app.com" \
  -d '{
    "model": "openai/o1-preview",
    "messages": [
      {
        "role": "user",
        "content": "Prove that the square root of 2 is irrational."
      }
    ],
    "max_tokens": 10000
  }'
Python SDK Example
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": """
            A farmer has a 400-meter fence and wants to enclose a rectangular
            field next to a river (no fence needed on the river side).
            What dimensions maximize the enclosed area?
            """
        }
    ],
    max_completion_tokens=10000
)
print(response.choices[0].message.content)
Complex Coding Problem
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": """
            Implement a function that finds the longest increasing subsequence
            in an array of integers. Explain your approach and analyze the
            time and space complexity.
            """
        }
    ],
    max_completion_tokens=8000
)
print(response.choices[0].message.content)
Scientific Reasoning
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": """
            Explain the mechanism by which CRISPR-Cas9 achieves gene editing,
            including the role of guide RNA, PAM sequences, and the repair
            mechanisms that follow DNA cleavage. Include potential off-target
            effects and current strategies to minimize them.
            """
        }
    ],
    max_completion_tokens=15000
)
print(response.choices[0].message.content)
OpenRouter Python Example
from openai import OpenAI
client = OpenAI(
    base_url="https://api.langmart.ai/v1",
    api_key="your-openrouter-key"
)
response = client.chat.completions.create(
    model="openai/o1-preview",
    messages=[
        {
            "role": "user",
            "content": "Derive the Euler-Lagrange equation from first principles."
        }
    ],
    max_tokens=10000,
    extra_headers={
        "HTTP-Referer": "https://your-app.com"
    }
)
print(response.choices[0].message.content)
Model Variants
| Variant | Model ID | Description |
|---------|----------|-------------|
| o1-preview | o1-preview | Standard reasoning model (128K context) |
| o1-preview-2024-09-12 | o1-preview-2024-09-12 | Specific version snapshot |
| o1 | o1 | Full o1 model (200K context, when available) |
| o1-mini | o1-mini | Smaller, faster, cheaper reasoning model |
Rate Limits
o1-preview is subject to stricter rate limits during its beta phase:
| Tier | Approximate Limits |
|------|--------------------|
| Standard | Heavily rate-limited |
| Usage-based | Increases with usage |
Note: Specific limits may vary. Check OpenAI's documentation for current limits.
Error Handling
Common errors with o1-preview:
| Error | Cause | Solution |
|-------|-------|----------|
| system_message_not_supported | Using system role | Remove system messages |
| streaming_not_supported | Stream parameter set | Set stream=false |
| rate_limit_exceeded | Too many requests | Implement backoff |
| context_length_exceeded | Input too long | Reduce input size |
| model_not_available | Model unavailable | Try again later |
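For the system-message error, a common workaround is to fold any system instructions into the first user message before sending the request. A sketch of that transformation (the helper name is ours, not part of any SDK):

```python
def to_o1_messages(messages):
    """Fold system messages into the first user message, since the
    o1-preview beta rejects the system role."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if system_parts and rest and rest[0]["role"] == "user":
        merged = "\n\n".join(system_parts + [rest[0]["content"]])
        rest[0] = {"role": "user", "content": merged}
    return rest
```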
Additional Resources
Last Updated: December 2024
Source: LangMart API, OpenAI Documentation, and third-party benchmarks