Magnum v4 72B

Description

Magnum v4 72B is a fine-tune of Qwen2.5 72B that aims to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. It is designed for creative writing and roleplay, with a focus on narrative and conversational text.

The model uses the ChatML instruction format.

Technical Specifications

Specification	Value
Context Window	32,768 tokens
Context Length	16,384 tokens
Underlying Model Context	32,768 tokens
Max Completion Tokens	2,048 tokens
Input Modalities	Text
Output Modalities	Text
Quantization	FP8

Pricing

Type	Price
Input	$3.00 per 1M tokens
Output	$5.00 per 1M tokens

Input: $0.000003 per token
Output: $0.000005 per token
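
The per-token prices above make request costs easy to estimate. The following is a minimal sketch of that arithmetic; the function name `estimate_cost` and the example token counts are illustrative and not part of any LangMart SDK.

```python
# Prices from the Pricing table, in USD per 1M tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 5.00


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    input_cost = prompt_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = completion_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 4,000 prompt tokens and 1,024 completion tokens.
    # Input: 4,000 * $0.000003 = $0.012; output: 1,024 * $0.000005 = $0.00512.
    print(f"${estimate_cost(4_000, 1_024):.6f}")  # prints $0.017120
```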

Supported Parameters

Parameter	Supported
`response_format`	Yes
`max_tokens`	Yes
`temperature`	Yes
`top_p`	Yes
`stop`	Yes
`frequency_penalty`	Yes
`presence_penalty`	Yes
`repetition_penalty`	Yes
`logit_bias`	Yes
`top_k`	Yes
`min_p`	Yes
`seed`	Yes
`top_a`	Yes
`logprobs`	Yes
`top_logprobs`	Yes
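
A request payload can be checked against the table above before it is sent. This is a sketch only: the helper `unsupported_parameters` is hypothetical, and treating `model` and `messages` as always-accepted fields is an assumption based on the usage examples in this document. How the API actually handles unknown parameters is not specified here.

```python
# Parameter names copied from the Supported Parameters table.
SUPPORTED_PARAMETERS = {
    "response_format", "max_tokens", "temperature", "top_p", "stop",
    "frequency_penalty", "presence_penalty", "repetition_penalty",
    "logit_bias", "top_k", "min_p", "seed", "top_a", "logprobs",
    "top_logprobs",
}

# Assumption: these request fields are always accepted, as in the usage examples.
REQUEST_FIELDS = {"model", "messages"}


def unsupported_parameters(payload: dict) -> list[str]:
    """Return the sorted payload keys that are not listed as supported."""
    allowed = SUPPORTED_PARAMETERS | REQUEST_FIELDS
    return sorted(key for key in payload if key not in allowed)


if __name__ == "__main__":
    payload = {
        "model": "anthracite-org/magnum-v4-72b",
        "messages": [{"role": "user", "content": "Hello"}],
        "temperature": 0.8,
        "tools": [],  # tool calling is listed as unsupported for this model
    }
    print(unsupported_parameters(payload))  # prints ['tools']
```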

Use Cases

Creative Writing: Novel writing, short stories, poetry
Roleplay: Character-driven conversations and scenarios
Dialogue Generation: Natural, engaging conversations
Narrative Content: Blog posts, articles with storytelling elements
Interactive Fiction: Text-based games and adventures

Limitations

No vision/image input support
No tool/function calling support
No reasoning/chain-of-thought capabilities
Context limited to 16,384 tokens (vs 32,768 in base model)
Maximum completion limited to 2,048 tokens
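
The two token limits above interact when choosing `max_tokens`. The sketch below assumes that prompt and completion tokens share the 16,384-token context, which this document does not state explicitly; `clamp_max_tokens` is a hypothetical helper, and the prompt token count must come from a tokenizer of your choice.

```python
CONTEXT_LIMIT = 16_384   # served context length, from the Limitations list
MAX_COMPLETION = 2_048   # maximum completion tokens, from the Limitations list


def clamp_max_tokens(prompt_tokens: int, requested: int) -> int:
    """Return a max_tokens value that respects both limits.

    Assumes the prompt and the completion share the same context budget.
    """
    if prompt_tokens < 0 or requested < 1:
        raise ValueError("prompt_tokens must be >= 0 and requested must be >= 1")
    remaining = CONTEXT_LIMIT - prompt_tokens
    if remaining < 1:
        raise ValueError("prompt leaves no room for a completion")
    return min(requested, MAX_COMPLETION, remaining)


if __name__ == "__main__":
    print(clamp_max_tokens(1_000, 4_096))   # prints 2048 (completion cap applies)
    print(clamp_max_tokens(16_000, 1_024))  # prints 384 (context budget applies)
```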

Providers

Provider	Model Variant ID	Notes
Mancer 2	`magnum-72b-v4`	Primary provider, FP8 quantization

Model Information

Property	Value
Model ID	`anthracite-org/magnum-v4-72b`
Name	Magnum v4 72B
Author	anthracite-org
Created	October 22, 2024
Base Model	Qwen2.5 72B
Instruction Format	ChatML

Feature Support

Feature	Supported
Tool/Function Calling	No
Reasoning/Chain-of-Thought	No
Vision/Image Input	No
Structured Output	No

Default Stop Tokens

<|im_start|>
<|im_end|>
<|endoftext|>
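
Stop tokens are normally applied by the serving side. If raw text ever needs to be trimmed on the client, for example when post-processing output from another backend, the idea can be sketched as follows. The helper `truncate_at_stop` is illustrative and not part of the API.

```python
# Default stop tokens listed above.
DEFAULT_STOP_TOKENS = ("<|im_start|>", "<|im_end|>", "<|endoftext|>")


def truncate_at_stop(text: str, stops=DEFAULT_STOP_TOKENS) -> str:
    """Cut text at the earliest occurrence of any stop token."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


if __name__ == "__main__":
    raw = "The lamp went dark.<|im_end|>\n<|im_start|>user\n"
    print(truncate_at_stop(raw))  # prints The lamp went dark.
```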

Usage Examples

LangMart API (Python)

import requests

response = requests.post(
    "https://api.langmart.ai/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_LANGMART_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "model": "anthracite-org/magnum-v4-72b",
        "messages": [
            {
                "role": "user",
                "content": "Write a short story about a mysterious lighthouse."
            }
        ],
        "max_tokens": 1024,
        "temperature": 0.8
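    },
    timeout=120,  # seconds; avoids waiting indefinitely on a slow generation
)

# Fail fast on HTTP errors (4xx/5xx) before parsing the body.
response.raise_for_status()

# Print the generated text from the first choice.
print(response.json()["choices"][0]["message"]["content"])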

LangMart API (cURL)

curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthracite-org/magnum-v4-72b",
    "messages": [
      {
        "role": "user",
        "content": "Write a short story about a mysterious lighthouse."
      }
    ],
    "max_tokens": 1024,
    "temperature": 0.8
  }'

LangMart API (JavaScript/Node.js)

const response = await fetch("https://api.langmart.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_LANGMART_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "anthracite-org/magnum-v4-72b",
    messages: [
      {
        role: "user",
        content: "Write a short story about a mysterious lighthouse."
      }
    ],
    max_tokens: 1024,
    temperature: 0.8
  })
});

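// Surface HTTP errors before parsing the body.
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const data = await response.json();
console.log(data.choices[0].message.content);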

ChatML Format Example

<|im_start|>system
You are a creative writing assistant specialized in crafting engaging narratives.
<|im_end|>
<|im_start|>user
Write a short story about a mysterious lighthouse.
<|im_end|>
<|im_start|>assistant
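
The chat completions endpoint applies this template on the server, so clients usually do not need to build it by hand. For raw-completion backends, the layout above can be reproduced with a small helper. This is a sketch that mirrors the line layout shown in this example; the function `to_chatml` is hypothetical, and the exact whitespace used by the model's tokenizer template may differ.

```python
def to_chatml(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Render role/content messages in the ChatML layout shown above."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}\n<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    prompt = to_chatml([
        {"role": "system", "content": "You are a creative writing assistant."},
        {"role": "user", "content": "Write a short story about a mysterious lighthouse."},
    ])
    print(prompt)
```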

Recommended Settings

For creative writing and roleplay:

Parameter	Recommended Value
`temperature`	0.7 - 0.9
`top_p`	0.9 - 0.95
`top_k`	40 - 100
`repetition_penalty`	1.1 - 1.15
`max_tokens`	1024 - 2048

For more deterministic outputs:

Parameter	Recommended Value
`temperature`	0.3 - 0.5
`top_p`	0.8
`repetition_penalty`	1.05
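
The two tables above can be kept as named presets and merged into a request payload. The sketch below picks one value from inside each recommended range; the specific values, the preset names, and the helper `build_payload` are illustrative choices, not official defaults.

```python
MODEL_ID = "anthracite-org/magnum-v4-72b"

# One value chosen from within each recommended range (illustrative picks).
PRESETS = {
    "creative": {
        "temperature": 0.8,
        "top_p": 0.9,
        "top_k": 40,
        "repetition_penalty": 1.1,
        "max_tokens": 1024,
    },
    "deterministic": {
        "temperature": 0.3,
        "top_p": 0.8,
        "repetition_penalty": 1.05,
    },
}


def build_payload(prompt: str, preset: str = "creative", **overrides) -> dict:
    """Build a chat completions payload from a preset, with optional overrides."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    payload.update(PRESETS[preset])
    payload.update(overrides)  # explicit arguments win over preset values
    return payload


if __name__ == "__main__":
    payload = build_payload("Summarize the plot so far.", "deterministic", seed=42)
    print(payload["temperature"], payload["seed"])  # prints 0.3 42
```

The resulting dictionary can be passed as the `json` argument of the `requests.post` call shown in the Python usage example.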

Source

Documentation: https://langmart.ai/model-docs
Base Model: Qwen2.5 72B by Alibaba Cloud