Cohere: Command R+
Inference Model ID: cohere/command-r-plus
Overview
| Property | Value |
|----------|-------|
| Provider | Cohere |
| Model ID | cohere/command-r-plus |
| Short Name | Command R+ |
| Created | April 4, 2024 |
| Parameters | 104 billion |
| Context Length | 128,000 tokens |
| Max Completion Tokens | 4,096 |
| Input Modalities | Text |
| Output Modalities | Text |
| Architecture | Auto-regressive transformer with an optimized design |
| Training | SFT and preference training aligned to human preferences |
Description
Command R+ is a 104B-parameter large language model from Cohere, purpose-built for enterprise applications. It excels at roleplay, general consumer use cases, and Retrieval Augmented Generation (RAG). The model features multilingual support for ten key languages to facilitate global business operations.
Key characteristics:
- Open Weights: Publicly available via HuggingFace (CohereForAI/c4ai-command-r-plus)
- RAG Optimized: State-of-the-art retrieval-augmented generation with grounded citations
- Multilingual Excellence: Strong performance across 10 primary + 13 secondary languages
- Tool Use: Single-step and multi-step (agentic) tool calling capabilities
- Enterprise Focus: Designed for enterprise-grade workloads with safety alignment
Pricing
| Type | Price per Million Tokens |
|------|--------------------------|
| Input Tokens | $2.50 |
| Output Tokens | $10.00 |
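At these rates, per-request cost is a linear function of token counts. A quick back-of-the-envelope helper (list prices only; illustrative, not an official billing formula):

```python
INPUT_PER_M = 2.50    # USD per million input tokens
OUTPUT_PER_M = 10.00  # USD per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at list price."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# e.g. a full 128K-token context plus a maximum 4,096-token completion:
print(f"${request_cost(128_000, 4_096):.4f}")  # -> $0.3610
```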
Capabilities
1. Grounded Generation & RAG
- Generates responses with citation spans from provided documents
- Supports "accurate" and "fast" citation modes
- Processes document chunks (100-400 words typical)
- Document format: key-value pairs with title/text structure
2. Single-Step Tool Use
- JSON-formatted action generation
- Multi-tool support with parameter specification
- Special directly_answer tool for abstention
- Two-inference model: Tool Selection -> Response Generation
3. Multi-Step Tool Use (Agentic)
- Iterative Action -> Observation -> Reflection cycles (see the sketch after this list)
- Multi-hop reasoning capabilities
- Sequential tool orchestration
4. Code Capabilities
- Code snippet interaction
- Code explanations and rewrites
- Optimized with low temperature for code generation
- Not optimized for pure code completion
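To make the multi-step pattern concrete, here is a minimal sketch of an Action -> Observation -> Reflection loop, assuming the gateway follows the OpenAI-style tool-calling wire format shown in the API examples below. The run_tool dispatcher, the stub weather result, and the step cap are illustrative assumptions, not part of any documented API.

```python
import json
import os

import requests

API_URL = "https://api.langmart.ai/v1/chat/completions"
HEADERS = {"Authorization": f"Bearer {os.environ['LANGMART_API_KEY']}"}

def run_tool(name: str, arguments: dict) -> str:
    # Hypothetical dispatcher: route the model's chosen action to real code.
    if name == "get_weather":
        return json.dumps({"location": arguments["location"], "temp_c": 18})
    raise ValueError(f"unknown tool: {name}")

def agent_loop(messages: list, tools: list, max_steps: int = 5) -> str:
    for _ in range(max_steps):
        # Action: ask the model for its next step.
        resp = requests.post(API_URL, headers=HEADERS, json={
            "model": "cohere/command-r-plus",
            "messages": messages,
            "tools": tools,
            "tool_choice": "auto",
        }).json()
        msg = resp["choices"][0]["message"]
        messages.append(msg)
        if not msg.get("tool_calls"):
            return msg["content"]  # No more actions: final answer.
        # Observation: execute each requested tool, feed results back.
        for call in msg["tool_calls"]:
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": run_tool(call["function"]["name"],
                                    json.loads(call["function"]["arguments"])),
            })
        # Reflection: on the next iteration the model reads the
        # observations and either acts again or answers.
    raise RuntimeError("agent did not finish within max_steps")
```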
Supported Parameters
| Parameter | Description |
|-----------|-------------|
| max_tokens | Maximum number of tokens to generate |
| temperature | Controls randomness in output generation (recommended: 0.3 for code) |
| top_p | Nucleus sampling probability threshold |
| top_k | Top-K sampling parameter |
| stop | Stop sequences to end generation |
| frequency_penalty | Penalty for token frequency |
| presence_penalty | Penalty for token presence |
| seed | Seed for reproducible outputs |
| response_format | Format specification for the response |
| structured_outputs | Enable structured output generation |
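As a quick illustration, a minimal Python request setting several of these parameters, assuming the OpenAI-compatible request shape used in the API examples below:

```python
import os

import requests

resp = requests.post(
    "https://api.langmart.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['LANGMART_API_KEY']}"},
    json={
        "model": "cohere/command-r-plus",
        "messages": [{"role": "user", "content": "Write a haiku about retrieval."}],
        "max_tokens": 128,    # cap completion length (model max: 4,096)
        "temperature": 0.7,   # higher for creative text; ~0.3 recommended for code
        "top_p": 0.9,         # nucleus sampling threshold
        "stop": ["\n\n"],     # end generation at a blank line
        "seed": 42,           # best-effort reproducibility
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```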
Features
| Feature | Supported |
|---------|-----------|
| Tool Choice | Yes (none, auto, required, function) |
| Reasoning | No |
| Chat Completions | Yes |
| Completions Endpoint | No |
| Multipart Support | Yes |
| Grounded Generation | Yes |
| RAG with Citations | Yes |
Tool choice options:
- none - Disable tool use
- auto - Let the model decide tool usage
- required - Force tool usage
- function - Select a specific function tool
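In practice the request body differs only in the tool_choice field. The object form for pinning a function below follows the OpenAI convention and is an assumption about this gateway:

```python
tools = [{"type": "function", "function": {
    "name": "get_weather",
    "description": "Get the current weather for a location",
    "parameters": {"type": "object",
                   "properties": {"location": {"type": "string"}},
                   "required": ["location"]}}}]

base = {"model": "cohere/command-r-plus",
        "messages": [{"role": "user", "content": "Weather in Paris?"}],
        "tools": tools}

body_none = {**base, "tool_choice": "none"}          # never call tools
body_auto = {**base, "tool_choice": "auto"}          # model decides
body_required = {**base, "tool_choice": "required"}  # must call a tool
# Pin one function by name (OpenAI-style object form; assumed here):
body_pinned = {**base, "tool_choice": {"type": "function",
                                       "function": {"name": "get_weather"}}}
```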
Use Cases
- Retrieval Augmented Generation: Enterprise search, document Q&A with citations
- Agentic Workflows: Complex multi-step tasks with tool usage
- Multilingual Applications: Global customer service, translation, content generation
- Roleplay & Creative: Conversational AI, character simulation
- Long Document Processing: Analysis of lengthy documents, contracts, research papers
- Enterprise Applications: Business-critical tasks requiring reliable performance
API Usage Example
```bash
curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cohere/command-r-plus",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "max_tokens": 1024,
    "temperature": 0.3
  }'
```
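Because the gateway exposes an OpenAI-style chat completions route, the OpenAI Python SDK can typically be pointed at it by overriding the base URL. A sketch under that compatibility assumption, not an officially documented client:

```python
import os

from openai import OpenAI

# Assumes the LangMart endpoint is OpenAI-compatible,
# as its /v1/chat/completions path suggests.
client = OpenAI(base_url="https://api.langmart.ai/v1",
                api_key=os.environ["LANGMART_API_KEY"])

resp = client.chat.completions.create(
    model="cohere/command-r-plus",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    max_tokens=1024,
    temperature=0.3,
)
print(resp.choices[0].message.content)
```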
Using with RAG/Grounded Generation
```bash
curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cohere/command-r-plus",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant that answers questions based on provided documents."
      },
      {
        "role": "user",
        "content": "Based on the following document, answer my question.\n\nDocument: {\"title\": \"Company Policy\", \"text\": \"All employees are entitled to 20 days of paid vacation per year.\"}\n\nQuestion: How many vacation days do employees get?"
      }
    ],
    "max_tokens": 512
  }'
```
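The capabilities section notes that documents are passed as title/text key-value pairs in chunks of roughly 100-400 words. A small helper along those lines; the exact prompt layout is an illustrative assumption, not a required format:

```python
import json

def format_grounded_prompt(question: str, documents: list[dict]) -> str:
    """Embed title/text document chunks in the user message, as in the
    curl example above. Chunks of ~100-400 words work best per the docs."""
    doc_lines = "\n".join(json.dumps(d) for d in documents)
    return (f"Based on the following documents, answer my question.\n\n"
            f"Documents:\n{doc_lines}\n\nQuestion: {question}")

docs = [
    {"title": "Company Policy",
     "text": "All employees are entitled to 20 days of paid vacation per year."},
    {"title": "Handbook Addendum",
     "text": "Unused vacation days do not roll over between calendar years."},
]
print(format_grounded_prompt("How many vacation days do employees get?", docs))
```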
Using with Tool Calling
```bash
curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cohere/command-r-plus",
    "messages": [
      {"role": "user", "content": "What is the weather in San Francisco?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
```
Using the Open Weights with HuggingFace Transformers
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereForAI/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the conversation with the model's built-in chat template.
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
)

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)
gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
Command R+ Variants
| Model ID | Description |
|----------|-------------|
| cohere/command-r-plus-08-2024 | Updated version with 50% higher throughput and 25% lower latency |
Cohere Model Family
| Model ID | Description |
|----------|-------------|
| cohere/command-r | Command R - smaller 35B-parameter model |
| cohere/command-r-08-2024 | Command R (August 2024 release) |
| cohere/command-a | Command A - latest 111B model with 256K context |
Similar Enterprise Models
| Model ID | Description |
|----------|-------------|
| meta-llama/llama-3.3-70b-instruct | Llama 3.3 70B - open weights |
| mistralai/mixtral-8x22b-instruct | Mixtral 8x22B - open-weights MoE |
| qwen/qwen-2.5-72b-instruct | Qwen 2.5 72B - open weights |
Providers
Primary Provider: Cohere
| Property | Value |
|----------|-------|
| Provider | Cohere |
| Provider Base URL | https://api.langmart.ai/v1 |
| Data Policy | Training disabled |
| Prompt Retention | 30 days |
| Publication Allowed | No |
Supported Languages
Primary Languages (10 Optimized)
- English
- French
- Spanish
- Italian
- German
- Brazilian Portuguese
- Japanese
- Korean
- Arabic
- Simplified Chinese
Secondary Languages (13 in Pre-training)
- Russian
- Polish
- Turkish
- Vietnamese
- Dutch
- Czech
- Indonesian
- Ukrainian
- Romanian
- Greek
- Hindi
- Hebrew
- Persian
Open LLM Leaderboard Scores
| Benchmark | Score |
|-----------|-------|
| Average | 74.6 |
| ARC (Challenge) | 70.99 |
| HellaSwag | 88.6 |
| MMLU | 75.7 |
| TruthfulQA | 56.3 |
| Winogrande | 85.4 |
| GSM8K | 70.7 |
Outperforms: DBRX Instruct (74.5), Mixtral 8x7B (72.7)
Model Weights
The model weights are publicly available on HuggingFace at CohereForAI/c4ai-command-r-plus.
Quantization Options
- 8-bit precision (via BitsAndBytes)
- 4-bit precision (separate quantized version available)
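A minimal sketch of the 8-bit path via BitsAndBytes; it requires the bitsandbytes and accelerate packages and a CUDA GPU, and roughly halves memory use versus fp16:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "CohereForAI/c4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load weights in 8-bit precision and spread them across available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
```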
Notes
- This model is part of Cohere's Command family released in April 2024
- With 128K context length, it supports extremely long documents and conversations
- Open weights enable self-hosting and fine-tuning (non-commercial use)
- Optimized for RAG with built-in citation capabilities
- Excellent multilingual support for global applications
- Strong tool calling for agentic workflows
- An updated version (08-2024) offers 50% higher throughput and 25% lower latency
Usage Policy
This model is subject to Cohere's Acceptable Use Policy; the open weights are released under a CC-BY-NC 4.0 license (non-commercial use, as noted above).
Source: LangMart Model Registry
HuggingFace: CohereForAI/c4ai-command-r-plus
Last Updated: December 23, 2025