Arcee AI: Spotlight
Overview
Model Name: Arcee AI: Spotlight
Type: Vision-Language Model (7-Billion Parameters)
Base Model: Qwen 2.5-VL
Developer: Arcee AI
Release Date: May 5, 2025
Description: Spotlight is a 7-billion-parameter vision-language model fine-tuned by Arcee AI for tight image-text grounding tasks. The model features a 32k-token context window and emphasizes fast inference on consumer GPUs while retaining strong captioning, visual-question-answering, and diagram-analysis accuracy.
Use Cases:
- Agent workflows with screenshots, charts, and UI mockups
- Image captioning and visual question-answering
- Diagram analysis
- Multimodal conversations combining documents and images
Technical Specifications
| Specification | Value |
|---|---|
| Context Window | 32,000 tokens |
| Parameter Count | 7 billion |
| Context Length | 131,072 tokens |
| Suggested Context | 32,768 tokens |
| Input Modalities | Image, Text |
| Output Modalities | Text |
| Max Completion Tokens | 65,537 |
Pricing
| Metric | Cost |
|---|---|
| Input | $0.18 per 1M tokens |
| Output | $0.18 per 1M tokens |
Supported Parameters
- max_tokens
- temperature
- top_p
- stop
- frequency_penalty
- presence_penalty
- top_k
- repetition_penalty
- logit_bias
- min_p
Related Models
The following resources are available for comparison and usage:
| Resource | URL |
|---|---|
| Chat Interface | https://langmart.ai/chat |
| Model Comparison | https://langmart.ai/model-docs |
Providers
Primary Provider: Together AI
| Detail | Information |
|---|---|
| Provider Name | Together AI |
| Base URL | https://api.langmart.ai/v1 |
| Model ID | arcee_ai/arcee-spotlight |
| Status Page | https://status.together.ai/ |
Data Policy
- Training on User Data: No
- Prompt Retention: No
- Output Publishing Restrictions: Cannot publish outputs
- Terms of Service: https://www.together.ai/terms-of-service
- Privacy Policy: https://www.together.ai/privacy
Performance Metrics
Benchmark Performance:
- Matches or outperforms larger vision-language models like LLaVA-1.6 13B on popular VQA and POPE alignment tests
- Strong performance on image captioning tasks
- Optimized for fast inference on consumer GPUs while maintaining accuracy