This page describes how to use Anthropic models through the AI Gateway. To learn more about the AI Gateway, see AI Gateway.
Quick Start
Access Anthropic’s Claude models through Orq’s unified API with automatic fallbacks, caching, and observability.
Available Models
Orq supports all Anthropic Claude models across multiple providers for optimal availability and pricing:
Latest Models
| Model | Context | Strengths | Best For |
|---|---|---|---|
| claude-opus-4-5-20251101 | 200K | Highest intelligence | Complex reasoning, research |
| claude-3-5-sonnet-20241022 | 200K | Best balance | Most tasks, coding |
| claude-3-5-haiku-20241022 | 200K | Fast responses | Simple tasks, chat |
Provider Options
Anthropic models are available through multiple providers (see the example below):
- anthropic/ - Direct Anthropic API
- aws/ - AWS Bedrock (enterprise features)
- google/ - Google Vertex AI (GCP integration)
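The same model can be targeted through each provider by prefixing its identifier. A minimal sketch; the exact prefix syntax is an assumption based on the list above:

```python
# Illustrative only: the provider-prefix format is an assumption
# based on the prefixes listed above.
DIRECT = "anthropic/claude-3-5-sonnet-20241022"  # direct Anthropic API
BEDROCK = "aws/claude-3-5-sonnet-20241022"       # AWS Bedrock
VERTEX = "google/claude-3-5-sonnet-20241022"     # Google Vertex AI
```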
Key Features
Prompt Caching
Cache frequently used context (system prompts, documents) to reduce costs by up to 90% and latency by up to 85%. Learn more about Prompt Caching.
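As a sketch of how caching is enabled at the API level, Anthropic's Messages API marks reusable context blocks with a cache_control field; whether Orq forwards this field unchanged is an assumption to verify against the Prompt Caching guide:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# A large, frequently reused context block (e.g. a policy document).
reference_doc = open("reference_doc.txt").read()

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": reference_doc,
            # Marks this block as cacheable; later requests that send the
            # identical block read it from the cache at a reduced price.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Summarize the key points."}],
)
print(response.content[0].text)
```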
Extended Thinking
Enable deep reasoning for complex problems with budget-based token allocation for internal analysis. Learn more about Extended Thinking.
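A minimal sketch using Anthropic's thinking parameter, which allocates a token budget for internal reasoning (parameter names follow Anthropic's Messages API; routing this through Orq is assumed to behave the same way):

```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-5-20251101",  # model id from the table above
    max_tokens=16000,  # must exceed the thinking budget
    # budget_tokens caps how many tokens the model may spend on internal
    # reasoning before it starts writing the final answer.
    thinking={"type": "enabled", "budget_tokens": 8000},
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

# The response interleaves "thinking" blocks with the final "text" blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)
```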
Vision Capabilities
All Claude 3+ models support image analysis with high accuracy.
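For illustration, images are passed as base64-encoded content blocks in Anthropic's Messages API format:

```python
import base64
import anthropic

client = anthropic.Anthropic()

with open("chart.png", "rb") as f:
    image_b64 = base64.standard_b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            # The image block precedes the question about it.
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": image_b64,
                },
            },
            {"type": "text", "text": "What trend does this chart show?"},
        ],
    }],
)
print(response.content[0].text)
```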
Tool Use (Function Calling)
Claude excels at tool use with sophisticated planning and execution.
Code Examples
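A minimal chat completion through the gateway, assuming an OpenAI-compatible proxy endpoint. The URL, header names, and response shape below are placeholders; consult Orq's API reference for the exact values:

```python
import os
import requests

# Placeholder endpoint and auth scheme: the exact Orq gateway URL,
# headers, and response envelope are assumptions.
GATEWAY_URL = "https://api.orq.ai/v2/proxy/chat/completions"

resp = requests.post(
    GATEWAY_URL,
    headers={"Authorization": f"Bearer {os.environ['ORQ_API_KEY']}"},
    json={
        "model": "anthropic/claude-3-5-sonnet-20241022",
        "max_tokens": 1024,  # required for Anthropic models
        "messages": [
            {"role": "user", "content": "Explain prompt caching in one paragraph."}
        ],
    },
    timeout=30,
)
resp.raise_for_status()
# Assumes an OpenAI-compatible response envelope.
print(resp.json()["choices"][0]["message"]["content"])
```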
Model Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| max_tokens | number | Maximum tokens to generate (required) | - |
| temperature | number | Randomness (0-1) | 1 |
| top_p | number | Nucleus sampling (0-1) | - |
| top_k | number | Top-K sampling | - |
| stop_sequences | string[] | Custom stop sequences | - |
max_tokens is required for Anthropic models. Typical values: 1024 for responses, 4096+ for long content.
Best Practices
Model selection:
- Opus 4.5: Complex analysis, research, advanced reasoning
- Sonnet 3.5: Most tasks, coding, general use (best price/performance)
- Haiku 3.5: Simple queries, fast responses, high-volume tasks
Response Structure
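Responses follow Anthropic's Messages API shape. The values below are illustrative, and the exact envelope returned by Orq may be normalized differently:

```json
{
  "id": "msg_01...",
  "type": "message",
  "role": "assistant",
  "model": "claude-3-5-sonnet-20241022",
  "content": [
    { "type": "text", "text": "Here is the summary..." }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 2095,
    "output_tokens": 503
  }
}
```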
Troubleshooting
Missing max_tokens error
- Anthropic models require the max_tokens parameter
- Add it to the request: max_tokens: 1024 (or an appropriate value)
High costs
- Enable prompt caching for repeated context
- Use smaller models (Haiku) for simple tasks
- Monitor token usage and optimize prompts
Rate limit errors
- Anthropic has tiered rate limits based on usage
- Use Orq’s automatic retries and fallbacks
- Consider AWS/Google providers for higher limits
Limitations
- max_tokens required: Unlike OpenAI models, you must specify the maximum output length
- Rate limits: Vary by tier and provider
- Context window: 200K tokens (may vary by provider)
- System prompts: Handled differently than OpenAI (automatically converted by Orq)