This page describes how to use Anthropic models through the AI Gateway. To learn more about the AI Gateway, see AI Gateway.

Quick Start

Access Anthropic’s Claude models through Orq’s unified API with automatic fallbacks, caching, and observability.
const response = await openai.chat.completions.create({
  model: "anthropic/claude-sonnet-4-5-20250929",
  messages: [
    {
      role: "user",
      content: "Explain quantum computing in simple terms",
    },
  ],
  max_tokens: 1024,
});
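The snippet above assumes an OpenAI SDK client already pointed at Orq's gateway. A minimal configuration sketch, assuming the proxy base URL from the Code Examples section and an ORQ_API_KEY environment variable (the client itself would be created with `new OpenAI(orqClientConfig)`):

```typescript
// Sketch: configuration for pointing the OpenAI SDK at Orq's AI Gateway.
// The base URL matches Orq's proxy endpoint; ORQ_API_KEY is assumed to
// hold your Orq workspace key.
const orqClientConfig = {
  baseURL: "https://api.orq.ai/v2/proxy",
  apiKey: process.env.ORQ_API_KEY ?? "",
};
// const openai = new OpenAI(orqClientConfig);
```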

Available Models

Orq supports all Anthropic Claude models across multiple providers for optimal availability and pricing:

Latest Models

Model                       Context  Strengths             Best For
claude-opus-4-5-20251101    200K     Highest intelligence  Complex reasoning, research
claude-3-5-sonnet-20241022  200K     Best balance          Most tasks, coding
claude-3-5-haiku-20241022   200K     Fast responses        Simple tasks, chat

Provider Options

Anthropic models are available through multiple providers:
  • anthropic/ - Direct Anthropic API
  • aws/ - AWS Bedrock (enterprise features)
  • google/ - Google Vertex AI (GCP integration)
// Direct Anthropic
model: "anthropic/claude-sonnet-4-5-20250929"

// AWS Bedrock
model: "aws/anthropic/claude-sonnet-4-5-20250929"

// Google Vertex AI
model: "google/anthropic/claude-opus-4-5-20251101"

Key Features

Prompt Caching

Cache frequently used context (system prompts, documents) to reduce costs by up to 90% and latency by up to 85%. Learn more about Prompt Caching
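As a back-of-the-envelope illustration of those savings, assume cached input tokens are billed at roughly a tenth of the normal input rate (the "up to 90%" figure above); the per-token rate below is a placeholder, not a real price:

```typescript
// Sketch: estimated input cost with and without prompt caching.
// `inputRate` is a placeholder price per token; cached tokens are
// assumed to cost 10% of that rate (the "up to 90%" reduction).
function estimateInputCost(
  totalTokens: number,
  cachedTokens: number,
  inputRate: number,
): number {
  const uncached = (totalTokens - cachedTokens) * inputRate;
  const cached = cachedTokens * inputRate * 0.1;
  return uncached + cached;
}

// A 10,000-token prompt where 9,000 tokens hit the cache:
const withCache = estimateInputCost(10_000, 9_000, 0.000003);
const withoutCache = estimateInputCost(10_000, 0, 0.000003);
```

Here the cached request costs roughly a fifth of the uncached one; the real ratio depends on the provider's actual cache pricing.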

Extended Thinking

Enable deep reasoning for complex problems with budget-based token allocation for internal analysis. Learn more about Extended Thinking

Vision Capabilities

All Claude 3+ models support image analysis with high accuracy.
const response = await openai.chat.completions.create({
  model: "anthropic/claude-sonnet-4-5-20250929",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        {
          type: "image_url",
          image_url: { url: "https://example.com/image.jpg" }
        },
      ],
    },
  ],
});

Tool Use (Function Calling)

Claude excels at tool use with sophisticated planning and execution.
const response = await openai.chat.completions.create({
  model: "anthropic/claude-sonnet-4-5-20250929",
  messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get current weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: { type: "string" },
          },
          required: ["location"],
        },
      },
    },
  ],
});
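When the model responds with a tool call, your code executes the function and returns the result. A sketch of the dispatch step, assuming the OpenAI-compatible tool_calls shape (the get_weather handler here is a stand-in, not a real weather lookup):

```typescript
// Sketch: dispatching a tool call from an OpenAI-compatible response.
type ToolCall = {
  id: string;
  function: { name: string; arguments: string }; // arguments is a JSON string
};

// Stand-in implementations keyed by tool name.
const toolHandlers: Record<string, (args: any) => string> = {
  get_weather: (args) => `Weather in ${args.location}: 22°C, clear`,
};

function dispatchToolCall(call: ToolCall): string {
  const handler = toolHandlers[call.function.name];
  if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
  return handler(JSON.parse(call.function.arguments));
}
```

The returned string would then be sent back as a role: "tool" message (with the matching tool_call_id) so the model can produce its final answer.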

Code Examples

curl -X POST https://api.orq.ai/v2/proxy/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-5-20250929",
    "messages": [
      {
        "role": "user",
        "content": "Write a Python function to calculate Fibonacci numbers"
      }
    ],
    "max_tokens": 1024
  }'

Model Parameters

Parameter       Type      Description                            Default
max_tokens      number    Maximum tokens to generate (required)  -
temperature     number    Randomness (0-1)                       1
top_p           number    Nucleus sampling (0-1)                 -
top_k           number    Top-K sampling                         -
stop_sequences  string[]  Custom stop sequences                  -
Note: max_tokens is required for Anthropic models. Typical values: 1024 for responses, 4096+ for long content.
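Putting the table together, a request body exercising these parameters might look like the sketch below; the values are illustrative, and the parameter names follow the table above:

```typescript
// Sketch: a request body using the parameters above (values illustrative).
const requestBody = {
  model: "anthropic/claude-sonnet-4-5-20250929",
  messages: [{ role: "user", content: "List three haiku themes" }],
  max_tokens: 1024,         // required for Anthropic models
  temperature: 0.7,         // 0-1; lower = more deterministic
  top_p: 0.9,               // nucleus sampling
  stop_sequences: ["\n\n"], // stop generating at a blank line
};
```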

Best Practices

Model selection:
  • Opus 4.5: Complex analysis, research, advanced reasoning
  • Sonnet 3.5: Most tasks, coding, general use (best price/performance)
  • Haiku 3.5: Simple queries, fast responses, high-volume tasks
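The selection guidance above can be sketched as a small picker, using the model IDs from the Latest Models table (the complexity tiers are an illustrative mapping, not an Orq feature):

```typescript
// Sketch: pick a model ID by task complexity, using the IDs from the
// Latest Models table above.
type Complexity = "simple" | "standard" | "complex";

function pickModel(complexity: Complexity): string {
  switch (complexity) {
    case "complex":
      return "anthropic/claude-opus-4-5-20251101";
    case "standard":
      return "anthropic/claude-3-5-sonnet-20241022";
    case "simple":
      return "anthropic/claude-3-5-haiku-20241022";
  }
}
```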
Token management:
// Set appropriate max_tokens based on task
const getMaxTokens = (taskType: string): number => {
  const limits: Record<string, number> = {
    chat: 1024,
    summary: 500,
    generation: 4096,
    analysis: 2048,
  };
  return limits[taskType] ?? 1024;
};
Multi-provider strategy:
// Use Orq's fallback system for reliability
const response = await openai.chat.completions.create({
  model: "anthropic/claude-sonnet-4-5-20250929",
  messages: [{ role: "user", content: "..." }],
  orq: {
    fallbacks: [
      { model: "aws/anthropic/claude-sonnet-4-5-20250929" },
      { model: "anthropic/claude-opus-4-5-20251101" },
    ],
  },
});

Response Structure

{
  "id": "msg_01ABC123",
  "object": "chat.completion",
  "created": 1704067200,
  "model": "anthropic/claude-sonnet-4-5-20250929",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Response content here"
      },
      "finish_reason": "end_turn"
    }
  ],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 250,
    "total_tokens": 350
  }
}
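The fields you'll typically read from this structure, sketched against the sample above:

```typescript
// Sketch: pulling the assistant text and token usage out of a response
// shaped like the sample above.
const sample = {
  choices: [
    {
      index: 0,
      message: { role: "assistant", content: "Response content here" },
      finish_reason: "end_turn",
    },
  ],
  usage: { prompt_tokens: 100, completion_tokens: 250, total_tokens: 350 },
};

const text = sample.choices[0]?.message?.content ?? "";
const tokensUsed = sample.usage.total_tokens;
```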

Troubleshooting

Missing max_tokens error
  • Anthropic models require max_tokens parameter
  • Add to request: max_tokens: 1024 (or appropriate value)
High costs
  • Enable prompt caching for repeated context
  • Use smaller models (Haiku) for simple tasks
  • Monitor token usage and optimize prompts
Rate limits
  • Anthropic has tiered rate limits based on usage
  • Use Orq’s automatic retries and fallbacks
  • Consider AWS/Google providers for higher limits
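If you layer your own retries on top of Orq's, exponential backoff with jitter is the usual pattern. A sketch, where the base delay and cap are assumptions rather than Orq defaults:

```typescript
// Sketch: exponential backoff delay for rate-limited requests.
// `baseMs` and `capMs` are illustrative values, not Orq defaults.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return exp / 2 + Math.random() * (exp / 2); // "equal jitter"
}
```

The caller would sleep for `backoffDelayMs(attempt)` milliseconds after each 429 before retrying.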

Limitations

  • max_tokens required: Unlike OpenAI models, Anthropic models require an explicit maximum output length
  • Rate limits: Vary by tier and provider
  • Context window: 200K tokens (may vary by provider)
  • System prompts: Handled differently than OpenAI (automatically converted by Orq)

Advanced Features

Streaming

const stream = await openai.chat.completions.create({
  model: "anthropic/claude-sonnet-4-5-20250929",
  messages: [{ role: "user", content: "Tell me a story" }],
  max_tokens: 2048,
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

PDF Input

Claude Opus 4.5 supports direct PDF analysis:
const response = await openai.chat.completions.create({
  model: "anthropic/claude-opus-4-5-20251101",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Summarize this document" },
        {
          type: "document",
          document: {
            type: "pdf",
            url: "https://example.com/document.pdf"
          }
        },
      ],
    },
  ],
  max_tokens: 2048,
});

Reference