Goliath 120B Model Documentation

Overview

  • Model Name: Goliath 120B
  • Creator: alpindale
  • Inference Model ID: alpindale/goliath-120b
  • Provider: Mancer (via LangMart)
  • Release Date: November 10, 2023

Description

Goliath 120B is a merged model that combines "two fine-tuned Llama 70B models into one 120B model" by merging Xwin and Euryale variants. The model was created using the mergekit framework by @chargoddard, with merge ratio optimization by @Undi95.

This model represents an advanced approach to model merging, leveraging the strengths of both fine-tuned variants to create a more capable 120B parameter model.

Technical Specifications

Model Architecture

  • Model Group: Llama2
  • Base Model: Llama 70B (merged variant)
  • Total Parameters: 120 Billion
  • Context Window: 6,144 tokens
  • Instruction Format: Airoboros
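
The Airoboros instruction format is a plain-text chat template. A minimal sketch of building such a prompt (the exact system line is an assumption based on common Airoboros usage, not taken from this model card):

def build_airoboros_prompt(user_message: str) -> str:
    # "A chat." is a typical Airoboros system line; adjust as needed.
    # The "USER:" / "ASSISTANT:" turn markers match the model's default
    # stop sequences listed below.
    return f"A chat.\nUSER: {user_message}\nASSISTANT:"

print(build_airoboros_prompt("Hello, how are you?"))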

Input/Output Capabilities

  • Input Modalities: Text only
  • Output Modalities: Text only
  • Max Completion Tokens: 1,024 per request
  • Default Stop Sequences: USER:, </s>
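
For illustration, a minimal sketch of a request that respects the 1,024-token completion cap and sets the default stop sequences explicitly, using the OpenAI-compatible endpoint described under Usage:

from openai import OpenAI

client = OpenAI(base_url="https://api.langmart.ai/v1", api_key="YOUR_LANGMART_API_KEY")

response = client.chat.completions.create(
    model="alpindale/goliath-120b",
    messages=[{"role": "user", "content": "Summarize the plot of Beowulf."}],
    max_tokens=1024,         # provider cap: 1,024 completion tokens per request
    stop=["USER:", "</s>"],  # mirrors the default stop sequences
)
print(response.choices[0].message.content)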

Pricing

  • Context Window: 6,144 tokens
  • Input Token Cost: $6.00 per million tokens
  • Output Token Cost: $8.00 per million tokens
  • Max Completion Tokens: 1,024 per request

Provider: Mancer 2

Cost Calculation Example

  • Request: 100 input tokens + 500 output tokens
  • Input cost: 100 × ($6/1M) = $0.0006
  • Output cost: 500 × ($8/1M) = $0.004
  • Total: $0.0046
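
The same arithmetic as a small helper, with rates taken from the pricing table above:

def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_rate: float = 6.00, output_rate: float = 8.00) -> float:
    # Rates are USD per million tokens.
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

print(f"${request_cost_usd(100, 500):.4f}")  # $0.0046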

Capabilities

  • Tool Use: No
  • Reasoning: No
  • Vision: No
  • Function Calling: No

Supported Parameters

The model supports a comprehensive set of parameters for fine-grained control:

  • Response Format: JSON mode, text
  • Token Limits: max_tokens, min_tokens
  • Sampling: temperature, top_p, top_k, top_a, min_p
  • Penalties: frequency_penalty, presence_penalty, repetition_penalty
  • Control: stop sequences, logit_bias
  • Advanced: seed (for reproducibility), logprobs

Parameter Details

  • temperature: Controls randomness (0.0 = deterministic, 2.0 = high randomness)
  • top_p: Nucleus sampling parameter (0.0-1.0)
  • top_k: Restricts sampling to top K tokens
  • top_a: Adaptive sampling cutoff that scales with the top token's probability
  • min_p: Minimum probability threshold
  • frequency_penalty: Reduces repetition in proportion to how often a token has already appeared
  • presence_penalty: Penalizes tokens that have appeared at all, regardless of count
  • repetition_penalty: Alternative token repetition control
  • logit_bias: Adjusts logit values for specific tokens
  • seed: Ensures reproducible outputs
  • logprobs: Returns log probabilities of generated tokens
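
As a sketch, the standard parameters map directly onto the OpenAI-compatible API, while provider-specific samplers such as top_k, top_a, min_p, and repetition_penalty would typically travel in extra_body (an assumption about how LangMart forwards non-standard fields):

from openai import OpenAI

client = OpenAI(base_url="https://api.langmart.ai/v1", api_key="YOUR_LANGMART_API_KEY")

response = client.chat.completions.create(
    model="alpindale/goliath-120b",
    messages=[{"role": "user", "content": "Write a limerick about model merging."}],
    temperature=0.8,
    top_p=0.95,
    frequency_penalty=0.2,
    seed=42,              # reproducible sampling
    extra_body={          # non-standard samplers, if the gateway forwards them
        "top_k": 40,
        "min_p": 0.05,
        "repetition_penalty": 1.1,
    },
)
print(response.choices[0].message.content)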

Use Cases

Goliath 120B is suitable for:

  • Long-form text generation
  • Advanced reasoning tasks
  • Complex dialogue and instruction following
  • Content creation and analysis
  • Fine-tuned inference applications
  • Creative writing and storytelling
  • Code generation and technical explanations
  • Document summarization and analysis

Limitations

  • Context Window: 6,144 tokens is limited compared to modern models (32K-200K+)
  • Modalities: Text-only (no vision, audio, or multimodal capabilities)
  • Advanced Features: No function calling, tool use, or extended reasoning support
  • Training Data: Based on Llama 2, may have knowledge cutoff limitations

Related Models

  • Xwin-LM: One of the base models used in the merge (a fine-tuned Llama 2 70B variant)
  • Euryale: The second base model used in the merge (a fine-tuned Llama 2 70B variant)
  • Llama 2 70B: Base architecture for both merged models
  • Goliath (Other Variants): Alternative Goliath model configurations

Development Credits

  • Merge Framework: @chargoddard (mergekit)
  • Merge Optimization: @Undi95 (ratio optimization)
  • Creator: alpindale

Provider Information

Mancer 2 Details

  • Provider Name: Mancer 2
  • Hosting Endpoint: neuro.mancer.tech/oai/v1
  • Data Retention Policy: No data retention for training purposes
  • Terms & Privacy: Terms of service available at mancer.tech
  • Hosting: Dedicated hosting for optimal performance

Integration & Access

LangMart Integration

The model is available through LangMart for:

  • Chat completions interface
  • Model comparison tools
  • Batch processing

Model ID: alpindale/goliath-120b

Model Weights: Publicly available on Hugging Face for community use and local deployment
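
For local experimentation, a minimal loading sketch with Hugging Face transformers (assuming the repository id matches the model ID above; a 120B model needs multiple high-memory GPUs or aggressive quantization):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("alpindale/goliath-120b")
model = AutoModelForCausalLM.from_pretrained(
    "alpindale/goliath-120b",
    device_map="auto",   # shard layers across available GPUs
    torch_dtype="auto",  # keep the checkpoint's native precision
)

inputs = tokenizer("A chat.\nUSER: Hello!\nASSISTANT:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))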

Usage

API Request Example

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer $LANGMART_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "alpindale/goliath-120b",
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ]
  }'

Python Example

from openai import OpenAI

client = OpenAI(
    base_url="https://api.langmart.ai/v1",
    api_key="YOUR_LANGMART_API_KEY"
)

response = client.chat.completions.create(
    model="alpindale/goliath-120b",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ]
)

print(response.choices[0].message.content)
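
Streaming Example

For longer generations it is often useful to stream tokens as they arrive. A minimal sketch reusing the client defined above (assuming LangMart supports the standard stream flag):

stream = client.chat.completions.create(
    model="alpindale/goliath-120b",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:  # the final chunk may carry no content
        print(delta, end="", flush=True)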

Performance Characteristics

  • Size: 120 Billion parameters
  • Context: 6,144 token context window (suitable for standard conversations and documents)
  • Inference Speed: Optimized by Mancer for fast inference
  • Quality: Result of advanced merging of fine-tuned Llama models
  • Instruction Following: Specialized instruction handling via the Airoboros prompt format

Model Comparison

Compared to other models available on LangMart:

  • vs Claude 3.5 Sonnet: Less capable but significantly cheaper, text-only
  • vs GPT-4: Stronger for creative tasks, weaker for complex reasoning
  • vs Llama 2 70B: Combines the strengths of two fine-tuned 70B variants through merging
  • vs Llama 3: Older architecture but still effective

Integration Guide

Using with LangMart

// A sketch using axios (any HTTP client works); assumes LANGMART_API_KEY is set
import axios from 'axios';

const client = axios.create({
  baseURL: 'https://api.langmart.ai',
  headers: { Authorization: `Bearer ${process.env.LANGMART_API_KEY}` }
});

const response = await client.post('/v1/chat/completions', {
  model: 'alpindale/goliath-120b',
  messages: [{ role: 'user', content: 'Your prompt here' }],
  temperature: 0.7,
  max_tokens: 1000
});

console.log(response.data.choices[0].message.content);

Environment Variables

LANGMART_API_KEY=your_api_key_here
LANGMART_MODEL_ID=alpindale/goliath-120b
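
A small sketch of reading these variables in Python:

import os

api_key = os.environ["LANGMART_API_KEY"]                                  # required
model_id = os.environ.get("LANGMART_MODEL_ID", "alpindale/goliath-120b")  # optional override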

References

  • Creator: alpindale
  • Mergekit Framework: Created by @chargoddard
  • Merge Optimization: @Undi95
  • Hosting Provider: Mancer 2
  • Platform: LangMart.ai
  • Repository: Hugging Face (model weights available)

Last Updated: December 23, 2025
Source: LangMart Model Registry
Model Card: Available on LangMart and Hugging Face