
Reka Flash
Reka · Vision · 32K context · $0.08 input / $0.12 output per 1M tokens · 8K max output

Reka Flash

Model ID: reka/reka-flash
Provider: Reka - Reka AI
Canonical Slug: reka/reka-flash

Overview

Reka Flash is a fast, efficient multimodal model from Reka AI, designed for quick inference on both text and image understanding tasks. It offers good performance with reduced latency.

Specifications

Specification        Value
Context Window       32,000 tokens
Max Output Tokens    8,192
Modality             text+image -> text
Release Date         2024-04-01 (Unix timestamp 1712000000)
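
A practical concern with a fixed context window is keeping prompt plus completion within budget. The sketch below is a rough pre-flight check using the common four-characters-per-token heuristic; this is an approximation, not Reka's actual tokenizer.

CONTEXT_WINDOW = 32_000   # tokens, from the table above
MAX_OUTPUT = 8_192        # tokens, from the table above

def fits_in_context(prompt: str, max_tokens: int = MAX_OUTPUT) -> bool:
    """Rough pre-flight check: ~4 characters per token on average."""
    estimated_prompt_tokens = len(prompt) // 4
    return estimated_prompt_tokens + max_tokens <= CONTEXT_WINDOW

print(fits_in_context("Explain quantum computing in simple terms"))  # True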

Pricing

Metric            Price
Prompt Cost       $0.08 per 1M tokens
Completion Cost   $0.12 per 1M tokens
Currency          USD
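
As a quick sanity check on the rates above, the sketch below estimates the cost of a single request from its token counts. The token counts are made-up example values.

PROMPT_RATE = 0.08 / 1_000_000       # USD per prompt token
COMPLETION_RATE = 0.12 / 1_000_000   # USD per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """USD cost of one request at Reka Flash rates."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# Example: a 2,000-token prompt that yields an 800-token reply.
print(f"${request_cost(2_000, 800):.6f}")  # $0.000256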

Capabilities

  • Vision / Multimodal (see the request sketch after this list)
  • Text Generation
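
Because the model accepts image input, a minimal vision request is sketched below. It assumes the LangMart endpoint is OpenAI-compatible and accepts image_url content parts (an assumption, not documented behavior here); the image URL is a placeholder.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["LANGMART_API_KEY"],
    base_url="https://api.langmart.ai/v1",  # gateway URL from the examples below
)

response = client.chat.completions.create(
    model="reka/reka-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                # Placeholder URL; assumes the gateway fetches public images.
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)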

Supported Parameters

The model supports the following parameters in API requests (a combined example follows the list):

  • temperature: Controls randomness (0.0 - 2.0), default: 1.0
  • top_p: Nucleus sampling (0.0 - 1.0), default: 1.0
  • top_k: Top-k filtering
  • frequency_penalty: Reduces repetition (-2.0 to 2.0)
  • presence_penalty: Encourages new topics (-2.0 to 2.0)
  • repetition_penalty: Alternative repetition control (0.5 - 2.0)
  • stop: Stop sequences
  • seed: Random seed for reproducibility
  • max_tokens: Maximum output length
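
The curl example in the next section uses the most common of these; the hedged Python sketch below exercises the rest. All values are illustrative rather than recommendations, and top_k / repetition_penalty are passed through extra_body because they are not named parameters in the OpenAI SDK.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["LANGMART_API_KEY"],
    base_url="https://api.langmart.ai/v1",
)

response = client.chat.completions.create(
    model="reka/reka-flash",
    messages=[{"role": "user", "content": "List three uses of graphene."}],
    temperature=0.7,           # lower = more deterministic
    top_p=0.9,                 # nucleus sampling cutoff
    frequency_penalty=0.2,     # discourage verbatim repetition
    presence_penalty=0.1,      # nudge toward new topics
    stop=["\n\n"],             # halt at the first blank line
    seed=42,                   # best-effort reproducibility
    max_tokens=512,
    # Non-standard parameters; assumes the gateway forwards them as-is.
    extra_body={"top_k": 40, "repetition_penalty": 1.1},
)
print(response.choices[0].message.content)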

API Usage Example

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "reka/reka-flash",
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum computing in simple terms"
      }
    ],
    "temperature": 1.0,
    "max_tokens": 8192,
    "top_p": 1.0
  }'

Performance Metrics

Speed & Quality Tradeoff

  • Inference Speed: Fast
  • Quality Tier: Advanced
  • Cost Efficiency: Optimized for production

Recommended Use Cases

  • Long-form text generation
  • Code generation and analysis
  • Conversational AI
  • Complex reasoning tasks
  • Information synthesis

Comparable Models from Other Providers

  • OpenAI: GPT-4 Turbo, GPT-4o
  • Anthropic: Claude 3.5 Sonnet
  • Google: Gemini 2.0 Flash
  • DeepSeek: DeepSeek-R1

Python Integration

import anthropic

# Point the Anthropic SDK at the LangMart gateway instead of Anthropic's API.
client = anthropic.Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://api.langmart.ai/v1"
)

message = client.messages.create(
    model="reka/reka-flash",
    max_tokens=8192,
    messages=[
        {
            "role": "user",
            "content": "Your prompt here"
        }
    ]
)

# The response is a list of content blocks; the first holds the text.
print(message.content[0].text)

JavaScript/Node.js Integration

import OpenAI from "openai";

// Point the OpenAI SDK at the LangMart gateway instead of OpenAI's API.
const openai = new OpenAI({
  apiKey: process.env.LANGMART_API_KEY,
  baseURL: "https://api.langmart.ai/v1",
});

const completion = await openai.chat.completions.create({
  model: "reka/reka-flash",
  messages: [
    {
      role: "user",
      content: "Your prompt here",
    },
  ],
  max_tokens: 8192,
});

console.log(completion.choices[0].message.content);

Performance Notes

Strengths

  • Efficient inference with good quality
  • Well-suited for production workloads
  • Strong instruction-following ability
  • Balanced performance and cost

Considerations

  • The 32,000-token context window may be limiting for very long documents
  • Tuned for speed; may trail larger flagship models on the most demanding tasks

Additional Information

  • Hugging Face Model: Not available
  • License: Open or Commercial (depends on provider)
  • Streaming: Supported (see the streaming sketch after this list)
  • Function Calling: Depends on model configuration
  • Vision Capabilities: Yes
  • Web Search: No
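
Since streaming is listed as supported, a minimal streaming sketch follows. It assumes the same OpenAI-compatible endpoint as the integration examples above.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["LANGMART_API_KEY"],
    base_url="https://api.langmart.ai/v1",
)

# Print tokens as they arrive instead of waiting for the full reply.
stream = client.chat.completions.create(
    model="reka/reka-flash",
    messages=[{"role": "user", "content": "Summarize the Reka Flash model."}],
    max_tokens=256,
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()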

Availability & Status

  • LangMart Status: Available
  • Rate Limits: Standard LangMart limits apply
  • SLA: Subject to provider availability

Documentation Generated: 2025-12-24
Source: LangMart API & Public Documentation
Last Updated: December 2025