# Anthropic Claude integration

> Access Claude models through Orq.ai. Use Claude 4.6 Opus, Sonnet, and Claude 4.5 Haiku with enhanced routing, caching, and prompt management capabilities.

## Set Up Your API Key

To use Anthropic with Orq.ai, follow these steps:

1. Navigate to **AI Router** > **Providers**
2. Find **Anthropic** in the list
3. Click the **Configure** button next to Anthropic
4. In the modal that opens, select **Setup your own API Key**
5. Enter a name for this configuration (e.g., "Anthropic Production")
6. Paste your Anthropic API Key into the provided field
7. Click **Save** to complete the setup

Your Anthropic API key is now configured and ready to use with Orq.ai in **AI Studio** or through the **AI Router**.

## Quick Start

Access Anthropic's Claude models through the **AI Router**.

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
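  import OpenAI from "openai";

  // Point the OpenAI SDK at the AI Router
  const openai = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await openai.chat.completions.create({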
    model: "anthropic/claude-sonnet-4-6",
    messages: [
      {
        role: "user",
        content: "Explain quantum computing in simple terms",
      },
    ],
    max_tokens: 1024,
  });
  ```
</CodeGroup>

## Available Models

Orq.ai supports all Anthropic Claude models across multiple providers for optimal availability and pricing:

### Latest Models

| Model                       | Context | Strengths                                    | Best For                                 |
| --------------------------- | ------- | -------------------------------------------- | ---------------------------------------- |
| `claude-opus-4-7`           | 1M      | Highest intelligence, xhigh reasoning effort | Coding, agentic tasks, complex reasoning |
| `claude-opus-4-6`           | 200K    | High intelligence                            | Complex reasoning, research              |
| `claude-sonnet-4-6`         | 200K    | Best balance                                 | Most tasks, coding                       |
| `claude-haiku-4-5-20251001` | 200K    | Fast responses                               | Simple tasks, chat                       |

### Provider Options

Anthropic models are available through multiple providers:

* **`anthropic/`** - Direct Anthropic API
* **`aws/`** - AWS Bedrock (enterprise features)
* **`google/`** - Google Vertex AI (GCP integration)

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  // Direct Anthropic
  model: "anthropic/claude-sonnet-4-6"

  // AWS Bedrock
  model: "aws/anthropic/claude-sonnet-4-6"

  // Google Vertex AI
  model: "google/anthropic/claude-opus-4-6"
  ```
</CodeGroup>
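
The prefixes above compose mechanically, so a small helper can keep model IDs consistent across a codebase. This is a minimal sketch: `router_model_id` and `PROVIDER_PREFIXES` are hypothetical names, and the prefix strings are copied from the examples above rather than fetched from the API.

```python
# Hypothetical helper: builds a router model ID from a provider route and a model name.
# The prefix strings mirror the examples above; they are not fetched from the API.
PROVIDER_PREFIXES = {
    "anthropic": "anthropic/",
    "aws": "aws/anthropic/",
    "google": "google/anthropic/",
}


def router_model_id(provider: str, model_name: str) -> str:
    """Return the model ID for a provider route, e.g. ('aws', 'claude-sonnet-4-6')."""
    try:
        prefix = PROVIDER_PREFIXES[provider]
    except KeyError:
        raise ValueError(f"Unknown provider route: {provider!r}") from None
    return prefix + model_name
```

Whether a given model is actually served through a given provider should still be checked against the supported models list.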

For a complete list of supported models, see [Supported Models](/docs/proxy/supported-models).

## Using the AI Router

Access Claude models (Claude 4.6 Opus, Sonnet, and Claude 4.5 Haiku) through the **AI Router** with advanced message APIs, tool use capabilities, and intelligent model routing. All Claude models are available with consistent formatting and pricing across multiple providers.

<Info>
  Claude models use the provider slug format: `anthropic/model-name`. For example: `anthropic/claude-sonnet-4-6`
</Info>

### Prerequisites

Before making requests to the **AI Router**, configure your environment and, if you plan to use an SDK, install it.

**Endpoint**

```
POST https://api.orq.ai/v3/router/chat/completions
```

**Required Headers**

Include the following headers in all requests:

```
Authorization: Bearer $ORQ_API_KEY
Content-Type: application/json
```

**Getting your API Key:**

1. Go to [API Keys](/docs/router/api-keys)
2. Click **Create API Key** and copy it
3. Store it in your environment as `ORQ_API_KEY`
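
If you prefer not to use an SDK, the endpoint and headers above are enough to build a request with the Python standard library. The sketch below only constructs the request object; `build_chat_request` is a hypothetical helper name, not part of any Orq.ai library.

```python
import json
import urllib.request

ROUTER_URL = "https://api.orq.ai/v3/router/chat/completions"


def build_chat_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the required headers."""
    return urllib.request.Request(
        ROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. In practice, read the key from the `ORQ_API_KEY` environment variable rather than hard-coding it.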

**SDK Installation**

Install the OpenAI SDK for your language:

<CodeGroup>
  ```bash Node.js/TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  npm install openai
  # or
  yarn add openai
  ```

  ```bash Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  pip install openai
  ```
</CodeGroup>

### Basic Usage

<Tip>
  If your OpenAI code is already functioning, you only need to change the `base_url` and `api_key` to the router endpoint and `ORQ_API_KEY`.
</Tip>

#### Chat Completion

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/chat/completions \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "anthropic/claude-sonnet-4-6",
      "messages": [
        {
          "role": "user",
          "content": "Explain quantum computing in simple terms"
        }
      ],
      "max_tokens": 1024
    }'
  ```

  ```python Python (OpenAI SDK) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  openai = OpenAI(
      api_key=os.environ.get('ORQ_API_KEY'),
      base_url='https://api.orq.ai/v3/router'
  )

  response = openai.chat.completions.create(
      model='anthropic/claude-sonnet-4-6',
      messages=[
          {
              'role': 'user',
              'content': 'Explain quantum computing in simple terms'
          }
      ],
      max_tokens=1024
  )

  print(response.choices[0].message.content)
  ```

  ```typescript NodeJS/TypeScript (OpenAI SDK) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const openai = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await openai.chat.completions.create({
    model: "anthropic/claude-sonnet-4-6",
    messages: [
      {
        role: "user",
        content: "Explain quantum computing in simple terms",
      },
    ],
    max_tokens: 1024,
  });

  console.log(response.choices[0].message.content);
  ```

  ```python Python (Anthropic SDK) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from anthropic import Anthropic
  import os

  client = Anthropic(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router"
  )

  message = client.messages.create(
      model="anthropic/claude-sonnet-4-6",
      max_tokens=1024,
      messages=[
          {
              "role": "user",
              "content": "Explain quantum computing in simple terms"
          }
      ]
  )

  print(message.content[0].text)
  ```

  ```typescript NodeJS (Anthropic SDK) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import Anthropic from "@anthropic-ai/sdk";

  const client = new Anthropic({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const message = await client.messages.create({
    model: "anthropic/claude-sonnet-4-6",
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content: "Explain quantum computing in simple terms",
      },
    ],
  });

  console.log(message.content[0].text);
  ```
</CodeGroup>

#### Streaming

Stream responses for real-time output instead of waiting for the complete response:

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  // Reuses the `openai` client configured in the Chat Completion example
  const stream = await openai.chat.completions.create({
    model: "anthropic/claude-sonnet-4-6",
    messages: [{ role: "user", content: "Tell me a story" }],
    max_tokens: 2048,
    stream: true,
  });

  for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content || "");
  }
  ```
</CodeGroup>
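
When the full text is also needed after streaming, the deltas can be accumulated as they arrive. The following is a sketch, assuming chunks shaped like the OpenAI SDK's streamed chat completion chunks (`choices[0].delta.content`); `collect_stream_text` is a hypothetical helper name.

```python
from typing import Any, Iterable


def collect_stream_text(chunks: Iterable[Any]) -> str:
    """Concatenate the text deltas from an iterable of streamed chat completion chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            # Some chunks (for example a trailing usage chunk) carry no choices.
            continue
        delta = chunk.choices[0].delta
        content = getattr(delta, "content", None)
        if content:
            parts.append(content)
    return "".join(parts)
```

With the Python client from the Chat Completion example, the iterable would be the object returned by `openai.chat.completions.create(..., stream=True)`.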

### Advanced Usage

#### Prompt Caching

<Note>
  Ensure that Prompt Caching is compatible with the chosen model.
</Note>

For a full guide, see [Prompt Caching](/docs/proxy/prompt-caching).

Cache frequently used context (system prompts, large documents, code bases) to reduce costs by up to 90% and latency by up to 85%.

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/chat/completions \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "anthropic/claude-sonnet-4-6",
      "messages": [
        {
          "role": "system",
          "content": [
            {
              "type": "text",
              "text": "You are an expert Python developer with deep knowledge of best practices.",
              "cache_control": { "type": "ephemeral" }
            }
          ]
        },
        {
          "role": "user",
          "content": "Write a function to parse JSON"
        }
      ],
      "max_tokens": 1024
    }'
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  openai = OpenAI(
      api_key=os.environ.get('ORQ_API_KEY'),
      base_url='https://api.orq.ai/v3/router'
  )

  response = openai.chat.completions.create(
      model='anthropic/claude-sonnet-4-6',
      messages=[
          {
              'role': 'system',
              'content': [
                  {
                      'type': 'text',
                      'text': 'You are an expert Python developer with deep knowledge of best practices.',
                      'cache_control': {'type': 'ephemeral'}
                  }
              ]
          },
          {
              'role': 'user',
              'content': 'Write a function to parse JSON'
          }
      ],
      max_tokens=1024
  )
  ```

  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const response = await openai.chat.completions.create({
    model: "anthropic/claude-sonnet-4-6",
    messages: [
      {
        role: "system",
        content: [
          {
            type: "text",
            text: "You are an expert Python developer with deep knowledge of best practices.",
            cache_control: { type: "ephemeral" },
          },
        ],
      },
      {
        role: "user",
        content: "Write a function to parse JSON",
      },
    ],
    max_tokens: 1024,
  });
  ```
</CodeGroup>

**How It Works**

Prompt caching stores frequently used content blocks on Anthropic's servers for reuse across requests:

1. **Mark content for caching**: Add `cache_control: { type: "ephemeral" }` to text blocks
2. **First request**: Content is processed normally and cached (cache write)
3. **Subsequent requests**: Cached content is reused (cache read)
4. **Cache lifetime**: 5 minutes from last use by default, or 1 hour when `ttl` is set to `"1h"` (automatically managed)
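
The write-then-read pattern above can be turned into a rough cost comparison. This is a back-of-the-envelope sketch, not a billing formula: `estimate_input_cost` is a hypothetical helper, and the default multipliers (a surcharge on cache writes, a discount on cache reads) are illustrative assumptions that should be replaced with the rates from your provider's current price list.

```python
def estimate_input_cost(
    cached_tokens: int,
    uncached_tokens: int,
    requests: int,
    base_price_per_token: float,
    write_multiplier: float = 1.25,  # assumption: surcharge applied to the cache write
    read_multiplier: float = 0.10,   # assumption: discount applied to cache reads
) -> tuple[float, float]:
    """Return (cost_without_cache, cost_with_cache) for requests made within one cache lifetime."""
    if requests < 1:
        raise ValueError("requests must be at least 1")
    per_request_plain = (cached_tokens + uncached_tokens) * base_price_per_token
    without_cache = requests * per_request_plain
    # First request writes the cache; the remaining requests read from it.
    first = (cached_tokens * write_multiplier + uncached_tokens) * base_price_per_token
    later = (cached_tokens * read_multiplier + uncached_tokens) * base_price_per_token
    with_cache = first + (requests - 1) * later
    return without_cache, with_cache
```

The model covers input tokens only and assumes every request after the first arrives before the cache expires.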

**Configuration**

Mark content blocks for caching by adding the `cache_control` parameter:

| Parameter | Type             | Required | Description                      |
| --------- | ---------------- | -------- | -------------------------------- |
| `type`    | `"ephemeral"`    | Yes      | Only supported cache type        |
| `ttl`     | `"5m"` \| `"1h"` | No       | Cache duration (default: `"5m"`) |

**Cache TTL Options**

The `ttl` parameter controls how long cached content persists:

* `"5m"` (5 minutes) - Default cache duration
* `"1h"` (1 hour) - Extended cache duration for longer-running workflows

```json theme={"theme":{"light":"github-light","dark":"github-dark"}}
{
  "cache_control": {
    "type": "ephemeral",
    "ttl": "1h"
  }
}
```
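
To avoid repeating the `cache_control` dictionary by hand, a text block can be built with a small helper that also rejects unsupported `ttl` values. This is a sketch; `cached_text_block` and `ALLOWED_TTLS` are hypothetical names, and the allowed values are taken from the table above.

```python
from typing import Optional

ALLOWED_TTLS = ("5m", "1h")  # values listed in the configuration table above


def cached_text_block(text: str, ttl: Optional[str] = None) -> dict:
    """Build a text content block marked for caching, optionally with an explicit TTL."""
    cache_control = {"type": "ephemeral"}
    if ttl is not None:
        if ttl not in ALLOWED_TTLS:
            raise ValueError(f"ttl must be one of {ALLOWED_TTLS}, got {ttl!r}")
        cache_control["ttl"] = ttl
    return {"type": "text", "text": text, "cache_control": cache_control}
```

The returned dictionary can be placed directly in a message's `content` list, as in the examples above.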

**Cache placement rules**

* Add `cache_control` to the **last** message or content block you want cached
* Everything up to that point is included in the cache
* Maximum: 4 cache breakpoints per request
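
The breakpoint limit can be checked before a request is sent. The sketch below counts only `cache_control` markers on content blocks inside `messages`, which is the placement used throughout this page; the function names are hypothetical.

```python
MAX_CACHE_BREAKPOINTS = 4  # limit stated in the placement rules above


def count_cache_breakpoints(messages: list) -> int:
    """Count content blocks in `messages` that carry a cache_control marker."""
    count = 0
    for message in messages:
        content = message.get("content")
        if isinstance(content, list):
            count += sum(
                1 for block in content
                if isinstance(block, dict) and "cache_control" in block
            )
    return count


def check_cache_breakpoints(messages: list) -> None:
    """Raise ValueError if the request would exceed the breakpoint limit."""
    found = count_cache_breakpoints(messages)
    if found > MAX_CACHE_BREAKPOINTS:
        raise ValueError(
            f"{found} cache breakpoints found; the maximum is {MAX_CACHE_BREAKPOINTS}"
        )
```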

**Minimum token thresholds**

Caching only activates once the marked content meets the model's minimum. Requests below the threshold are processed normally at full cost.

| Model                                                     | Minimum tokens |
| --------------------------------------------------------- | -------------- |
| Claude Opus 4.6, Opus 4.5                                 | 4,096          |
| Claude Sonnet 4.6                                         | 2,048          |
| Claude Sonnet 4.5, Opus 4.1, Opus 4, Sonnet 4, Sonnet 3.7 | 1,024          |
| Claude Haiku 4.5                                          | 4,096          |
| Claude Haiku 3.5, Haiku 3                                 | 2,048          |
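
The table can be encoded as a lookup to decide whether marking a block is worthwhile. This is a sketch keyed by the display names used in the table; `is_cacheable` is a hypothetical helper, and the token count is assumed to come from your own tokenizer or estimate.

```python
# Minimum cacheable token counts, copied from the table above.
MIN_CACHE_TOKENS = {
    "Claude Opus 4.6": 4096,
    "Claude Opus 4.5": 4096,
    "Claude Sonnet 4.6": 2048,
    "Claude Sonnet 4.5": 1024,
    "Claude Opus 4.1": 1024,
    "Claude Opus 4": 1024,
    "Claude Sonnet 4": 1024,
    "Claude Sonnet 3.7": 1024,
    "Claude Haiku 4.5": 4096,
    "Claude Haiku 3.5": 2048,
    "Claude Haiku 3": 2048,
}


def is_cacheable(model_name: str, token_count: int) -> bool:
    """Return True if `token_count` meets the model's minimum for caching."""
    if model_name not in MIN_CACHE_TOKENS:
        raise ValueError(f"No threshold recorded for {model_name!r}")
    return token_count >= MIN_CACHE_TOKENS[model_name]
```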

**Use Cases**

<AccordionGroup>
  <Accordion title="Static System Prompts">
    Cache role definitions and instructions that don't change.

    <CodeGroup>
      ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
      curl -X POST https://api.orq.ai/v3/router/chat/completions \
        -H "Authorization: Bearer $ORQ_API_KEY" \
        -H "Content-Type: application/json" \
        -d '{
          "model": "anthropic/claude-sonnet-4-6",
          "messages": [
            {
              "role": "system",
              "content": [
                {
                  "type": "text",
                  "text": "You are an expert software engineer specializing in Python.\nYour responses should be:\n- Clear and concise\n- Include code examples\n- Follow PEP 8 style guidelines\n- Include error handling",
                  "cache_control": { "type": "ephemeral" }
                }
              ]
            },
            {
              "role": "user",
              "content": "How do I read a CSV file?"
            }
          ],
          "max_tokens": 1024
        }'
      ```

      ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
      from openai import OpenAI
      import os

      openai = OpenAI(
          api_key=os.environ.get('ORQ_API_KEY'),
          base_url='https://api.orq.ai/v3/router'
      )

      system_prompt = """You are an expert software engineer specializing in Python.
      Your responses should be:
      - Clear and concise
      - Include code examples
      - Follow PEP 8 style guidelines
      - Include error handling"""

      response = openai.chat.completions.create(
          model='anthropic/claude-sonnet-4-6',
          messages=[
              {
                  'role': 'system',
                  'content': [
                      {
                          'type': 'text',
                          'text': system_prompt,
                          'cache_control': {'type': 'ephemeral'}
                      }
                  ]
              },
              {
                  'role': 'user',
                  'content': 'How do I read a CSV file?'
              }
          ],
          max_tokens=1024
      )
      ```

      ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
      const response = await openai.chat.completions.create({
        model: 'anthropic/claude-sonnet-4-6',
        messages: [
          {
            role: 'system',
            content: [
              {
                type: 'text',
                text: `You are an expert software engineer specializing in Python.
      Your responses should be:
      - Clear and concise
      - Include code examples
      - Follow PEP 8 style guidelines
      - Include error handling`,
                cache_control: { type: 'ephemeral' },
              },
            ],
          },
          {
            role: 'user',
            content: 'How do I read a CSV file?',
          },
        ],
        max_tokens: 1024,
      });
      ```
    </CodeGroup>
  </Accordion>

  <Accordion title="Large Document Context">
    Cache documents, codebases, or knowledge bases for reuse across multiple queries.

    <CodeGroup>
      ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
      curl -X POST https://api.orq.ai/v3/router/chat/completions \
        -H "Authorization: Bearer $ORQ_API_KEY" \
        -H "Content-Type: application/json" \
        -d '{
          "model": "anthropic/claude-sonnet-4-6",
          "messages": [
            {
              "role": "user",
              "content": [
                {
                  "type": "text",
                  "text": "Here is our API documentation:\n\n[Large documentation content here...]",
                  "cache_control": { "type": "ephemeral" }
                },
                {
                  "type": "text",
                  "text": "How do I authenticate with the API?"
                }
              ]
            }
          ],
          "max_tokens": 1024
        }'
      ```

      ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
      from openai import OpenAI
      import os

      openai = OpenAI(
          api_key=os.environ.get('ORQ_API_KEY'),
          base_url='https://api.orq.ai/v3/router'
      )

      # Load your API documentation; load_api_documentation() is a placeholder for your own loader
      api_docs = load_api_documentation()

      response = openai.chat.completions.create(
          model='anthropic/claude-sonnet-4-6',
          messages=[
              {
                  'role': 'user',
                  'content': [
                      {
                          'type': 'text',
                          'text': f'Here is our API documentation:\n\n{api_docs}',
                          'cache_control': {'type': 'ephemeral'}
                      },
                      {
                          'type': 'text',
                          'text': 'How do I authenticate with the API?'
                      }
                  ]
              }
          ],
          max_tokens=1024
      )
      ```

      ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
      // apiDocs is a placeholder string holding your documentation text
      const response = await openai.chat.completions.create({
        model: 'anthropic/claude-sonnet-4-6',
        messages: [
          {
            role: 'user',
            content: [
              {
                type: 'text',
                text: 'Here is our API documentation:\n\n' + apiDocs,
                cache_control: { type: 'ephemeral' },
              },
              {
                type: 'text',
                text: 'How do I authenticate with the API?',
              },
            ],
          },
        ],
        max_tokens: 1024,
      });
      ```
    </CodeGroup>
  </Accordion>

  <Accordion title="Multi-turn Conversations">
    Cache conversation history for long interactions to reduce processing time and costs on subsequent messages.

    <CodeGroup>
      ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
      curl -X POST https://api.orq.ai/v3/router/chat/completions \
        -H "Authorization: Bearer $ORQ_API_KEY" \
        -H "Content-Type: application/json" \
        -d '{
          "model": "anthropic/claude-sonnet-4-6",
          "messages": [
            {
              "role": "user",
              "content": "What is Python?"
            },
            {
              "role": "assistant",
              "content": "Python is a high-level programming language..."
            },
            {
              "role": "user",
              "content": [
                {
                  "type": "text",
                  "text": "What are its main features?",
                  "cache_control": { "type": "ephemeral" }
                }
              ]
            },
            {
              "role": "assistant",
              "content": "Python'\''s main features include..."
            },
            {
              "role": "user",
              "content": "Can you give me a code example?"
            }
          ],
          "max_tokens": 1024
        }'
      ```

      ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
      from openai import OpenAI
      import os

      openai = OpenAI(
          api_key=os.environ.get('ORQ_API_KEY'),
          base_url='https://api.orq.ai/v3/router'
      )

      conversation_history = [
          {'role': 'user', 'content': 'What is Python?'},
          {'role': 'assistant', 'content': 'Python is a high-level...'},
          {'role': 'user', 'content': 'What are its main features?'},
          {'role': 'assistant', 'content': 'Python\'s main features include...'},
      ]

      # Mark last history message for caching
      last_message = conversation_history[-1]
      messages = conversation_history[:-1] + [
          {
              'role': last_message['role'],
              'content': [
                  {
                      'type': 'text',
                      'text': last_message['content'],
                      'cache_control': {'type': 'ephemeral'}
                  }
              ]
          },
          {
              'role': 'user',
              'content': 'Can you give me a code example?'
          }
      ]

      response = openai.chat.completions.create(
          model='anthropic/claude-sonnet-4-6',
          messages=messages,
          max_tokens=1024
      )
      ```

      ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
      const conversationHistory = [
        { role: 'user', content: 'What is Python?' },
        { role: 'assistant', content: 'Python is a high-level...' },
        { role: 'user', content: 'What are its main features?' },
        { role: 'assistant', content: "Python's main features include..." },
        // ... more history
      ];

      // Mark last history message for caching
      const lastHistoryMessage = conversationHistory[conversationHistory.length - 1];

      const response = await openai.chat.completions.create({
        model: 'anthropic/claude-sonnet-4-6',
        messages: [
          ...conversationHistory.slice(0, -1),
          {
            ...lastHistoryMessage,
            content: [
              {
                type: 'text',
                text: lastHistoryMessage.content,
                cache_control: { type: 'ephemeral' },
              },
            ],
          },
          {
            role: 'user',
            content: 'Can you give me a code example?',
          },
        ],
        max_tokens: 1024,
      });
      ```
    </CodeGroup>
  </Accordion>

  <Accordion title="RAG with Document Collections">
    Cache retrieved documents for multiple queries in retrieval-augmented generation scenarios.

    <CodeGroup>
      ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
      curl -X POST https://api.orq.ai/v3/router/chat/completions \
        -H "Authorization: Bearer $ORQ_API_KEY" \
        -H "Content-Type: application/json" \
        -d '{
          "model": "anthropic/claude-sonnet-4-6",
          "messages": [
            {
              "role": "system",
              "content": "You are a helpful assistant that answers based on provided context."
            },
            {
              "role": "user",
              "content": [
                {
                  "type": "text",
                  "text": "Context:\n[Retrieved document content here...]",
                  "cache_control": { "type": "ephemeral" }
                },
                {
                  "type": "text",
                  "text": "Question: What is the main topic of these documents?"
                }
              ]
            }
          ],
          "max_tokens": 1024
        }'
      ```

      ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
      from openai import OpenAI
      import os

      openai = OpenAI(
          api_key=os.environ.get('ORQ_API_KEY'),
          base_url='https://api.orq.ai/v3/router'
      )

      # Retrieve documents from your vector store; retrieve_documents(), query, and user_question are placeholders you define
      documents = retrieve_documents(query)
      context_text = '\n\n'.join([doc['content'] for doc in documents])

      response = openai.chat.completions.create(
          model='anthropic/claude-sonnet-4-6',
          messages=[
              {
                  'role': 'system',
                  'content': 'You are a helpful assistant that answers based on provided context.'
              },
              {
                  'role': 'user',
                  'content': [
                      {
                          'type': 'text',
                          'text': f'Context:\n{context_text}',
                          'cache_control': {'type': 'ephemeral'}
                      },
                      {
                          'type': 'text',
                          'text': f'Question: {user_question}'
                      }
                  ]
              }
          ],
          max_tokens=1024
      )
      ```

      ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
      // retrieveDocuments(), query, and userQuestion are placeholders for your own retrieval code
      const documents = await retrieveDocuments(query);
      const contextText = documents.map((d) => d.content).join('\n\n');

      const response = await openai.chat.completions.create({
        model: 'anthropic/claude-sonnet-4-6',
        messages: [
          {
            role: 'system',
            content:
              'You are a helpful assistant that answers based on provided context.',
          },
          {
            role: 'user',
            content: [
              {
                type: 'text',
                text: `Context:\n${contextText}`,
                cache_control: { type: 'ephemeral' },
              },
              {
                type: 'text',
                text: `Question: ${userQuestion}`,
              },
            ],
          },
        ],
        max_tokens: 1024,
      });
      ```
    </CodeGroup>
  </Accordion>
</AccordionGroup>

#### Extended Thinking

Enable deep reasoning for complex problems by allocating token budget for internal analysis before generating responses.

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/chat/completions \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "anthropic/claude-opus-4-6",
      "messages": [
        {
          "role": "user",
          "content": "Design a distributed rate limiting system for 1M requests/second"
        }
      ],
      "thinking": {
        "type": "enabled",
        "budget_tokens": 8000
      },
      "max_tokens": 16000
    }'
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  openai = OpenAI(
      api_key=os.environ.get('ORQ_API_KEY'),
      base_url='https://api.orq.ai/v3/router'
  )

  response = openai.chat.completions.create(
      model='anthropic/claude-opus-4-6',
      messages=[
          {
              'role': 'user',
              'content': 'Design a distributed rate limiting system for 1M requests/second'
          }
      ],
      extra_body={
          'thinking': {
              'type': 'enabled',
              'budget_tokens': 8000
          }
      },
      max_tokens=16000  # must exceed thinking.budget_tokens
  )
  ```

  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const response = await openai.chat.completions.create({
    model: "anthropic/claude-opus-4-6",
    messages: [
      {
        role: "user",
        content: "Design a distributed rate limiting system for 1M requests/second",
      },
    ],
    thinking: {
      type: "enabled",
      budget_tokens: 8000,
    },
    max_tokens: 16000, // must exceed thinking.budget_tokens
  });
  ```
</CodeGroup>

<Accordion title="Multi-turn Extended Thinking">
  Include reasoning content with its signature when continuing conversations:

  <CodeGroup>
    ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
    const messages = [
      { role: "user", content: "Design a rate limiting system" }
    ];

    const response = await openai.chat.completions.create({
      model: "anthropic/claude-opus-4-6",
      messages,
      thinking: { type: "enabled", budget_tokens: 8000 },
      max_tokens: 16000 // must exceed thinking.budget_tokens
    });

    // Map response to assistant message
    const msg = response.choices[0].message;
    const contentParts = [];

    if (msg.reasoning) {
      contentParts.push({
        type: "reasoning",
        reasoning: msg.reasoning,
        signature: msg.reasoning_signature
      });
    }

    if (msg.redacted_reasoning) {
      contentParts.push({
        type: "redacted_reasoning",
        data: msg.redacted_reasoning
      });
    }

    if (msg.content) {
      contentParts.push({
        type: "text",
        text: msg.content
      });
    }

    const assistantMessage = {
      role: "assistant",
      content: contentParts
    };

    messages.push(assistantMessage);
    messages.push({ role: "user", content: "How would you handle 10M req/s?" });

    const followUp = await openai.chat.completions.create({
      model: "anthropic/claude-opus-4-6",
      messages,
      thinking: { type: "enabled", budget_tokens: 8000 },
      max_tokens: 16000 // must exceed thinking.budget_tokens
    });
    ```

    ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
    messages = [
      {"role": "user", "content": "Design a rate limiting system"}
    ]

    response = openai.chat.completions.create(
      model='anthropic/claude-opus-4-6',
      messages=messages,
      extra_body={
        'thinking': {
          'type': 'enabled',
          'budget_tokens': 8000
        }
      },
      max_tokens=16000  # must exceed thinking.budget_tokens
    )

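    msg = response.choices[0].message
    content_parts = []

    # Reasoning fields are router extensions; getattr avoids an AttributeError when one is absent
    reasoning = getattr(msg, "reasoning", None)
    if reasoning:
      content_parts.append({
        "type": "reasoning",
        "reasoning": reasoning,
        "signature": getattr(msg, "reasoning_signature", None)
      })

    redacted_reasoning = getattr(msg, "redacted_reasoning", None)
    if redacted_reasoning:
      content_parts.append({
        "type": "redacted_reasoning",
        "data": redacted_reasoning
      })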

    if msg.content:
      content_parts.append({
        "type": "text",
        "text": msg.content
      })

    assistant_message = {
      "role": "assistant",
      "content": content_parts
    }

    messages.append(assistant_message)
    messages.append({"role": "user", "content": "How would you handle 10M req/s?"})

    follow_up = openai.chat.completions.create(
      model='anthropic/claude-opus-4-6',
      messages=messages,
      extra_body={
        'thinking': {
          'type': 'enabled',
          'budget_tokens': 8000
        }
      },
      max_tokens=16000  # must exceed thinking.budget_tokens
    )
    ```

    ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
    curl -X POST https://api.orq.ai/v3/router/chat/completions \
      -H "Authorization: Bearer $ORQ_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "anthropic/claude-opus-4-6",
        "messages": [
          {"role": "user", "content": "Design a rate limiting system"},
          {
            "role": "assistant",
            "content": [
              {
                "type": "reasoning",
                "reasoning": "...",
                "signature": "..."
              },
              {
                "type": "text",
                "text": "Here'\''s a distributed rate limiting design..."
              }
            ]
          },
          {"role": "user", "content": "How would you handle 10M req/s?"}
        ],
        "thinking": {"type": "enabled", "budget_tokens": 8000},
        "max_tokens": 16000
      }'
    ```
  </CodeGroup>

  <Warning>
    **Important**: Always include the `signature` field when passing reasoning content back to the API. The signature cryptographically verifies the reasoning was generated by the model and is required for multi-turn conversations.
  </Warning>
</Accordion>
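
A missing signature is easy to overlook when conversation history is assembled by hand, so it can help to check the history before sending a follow-up. This is a sketch assuming the content-part shape shown in the examples above; `missing_signatures` is a hypothetical helper name.

```python
def missing_signatures(messages: list) -> list:
    """Return (message_index, part_index) pairs for reasoning parts lacking a signature."""
    problems = []
    for m_idx, message in enumerate(messages):
        content = message.get("content")
        if message.get("role") != "assistant" or not isinstance(content, list):
            continue
        for p_idx, part in enumerate(content):
            if part.get("type") == "reasoning" and not part.get("signature"):
                problems.append((m_idx, p_idx))
    return problems
```

An empty list means every reasoning part in the history carries a signature.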

<Accordion title="Combine with prompt caching for repeated contexts">
  Cache system prompts and context to reduce costs and latency when using extended thinking:

  ```typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const response = await openai.chat.completions.create({
    model: "anthropic/claude-opus-4-6",
    messages: [
      {
        role: "system",
        content: [{
          type: "text",
          text: "You are a system architect...", // Cache this
          cache_control: { type: "ephemeral" }
        }]
      },
      { role: "user", content: "Design a notification system" }
    ],
    thinking: { type: "enabled", budget_tokens: 8000 },
    max_tokens: 16000 // required, and must be greater than budget_tokens
  });
  ```
</Accordion>

**Configuration & Best Practices**

| Aspect                   | Guidance                | Details                                                      |
| ------------------------ | ----------------------- | ------------------------------------------------------------ |
| `thinking.type`          | Set to `"enabled"`      | Enables extended thinking with manual budget                 |
| `thinking.budget_tokens` | Set based on complexity | Min: 1024, must be \< `max_tokens`. Billed as output tokens. |

<Note>
  **Supported Models:** Extended thinking with `budget_tokens` is available on Claude Opus 4.5, Sonnet 4.5, and newer models. For Claude Opus 4.6 and Sonnet 4.6, consider using **adaptive thinking** instead (see below). Available through `anthropic/`, `aws/`, and `google/` providers.
</Note>
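
The two `budget_tokens` constraints from the table can be checked before a request is sent. This is a minimal sketch; `check_thinking_budget` is a hypothetical helper, not an Orq or Anthropic API, and it only encodes the limits stated on this page.

```python
MIN_BUDGET_TOKENS = 1024  # minimum stated in the table above


def check_thinking_budget(budget_tokens: int, max_tokens: int) -> None:
    """Raise ValueError when the budget violates the limits in the table above."""
    if budget_tokens < MIN_BUDGET_TOKENS:
        raise ValueError(f"budget_tokens must be at least {MIN_BUDGET_TOKENS}")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be less than max_tokens")


check_thinking_budget(budget_tokens=8000, max_tokens=16000)  # valid, returns None

try:
    check_thinking_budget(budget_tokens=8000, max_tokens=2048)
except ValueError as error:
    print(error)  # budget_tokens must be less than max_tokens
```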

<Card title="Reasoning models" icon="brain" href="/docs/proxy/reasoning" horizontal>
  Configure `thinking.budget_tokens` and other extended thinking controls for Claude through the **AI Router**.
</Card>

#### Adaptive Thinking

Adaptive thinking is the recommended way to use extended thinking with **Claude Opus 4.6** and **Sonnet 4.6**. Instead of manually setting a thinking token budget, adaptive thinking lets Claude dynamically determine when and how much to think based on the complexity of each request.

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/chat/completions \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "anthropic/claude-opus-4-6",
      "messages": [
        {
          "role": "user",
          "content": "Design a distributed rate limiting system for 1M requests/second"
        }
      ],
      "thinking": {
        "type": "adaptive"
      },
      "max_tokens": 16000
    }'
  ```

  ```python Python (OpenAI SDK) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  openai = OpenAI(
      api_key=os.environ.get('ORQ_API_KEY'),
      base_url='https://api.orq.ai/v3/router'
  )

  response = openai.chat.completions.create(
      model='anthropic/claude-opus-4-6',
      messages=[
          {
              'role': 'user',
              'content': 'Design a distributed rate limiting system for 1M requests/second'
          }
      ],
      extra_body={
          'thinking': {
              'type': 'adaptive'
          }
      },
      max_tokens=16000
  )

  print(response.choices[0].message.content)
  ```

  ```typescript NodeJS/TypeScript (OpenAI SDK) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const openai = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await openai.chat.completions.create({
    model: "anthropic/claude-opus-4-6",
    messages: [
      {
        role: "user",
        content: "Design a distributed rate limiting system for 1M requests/second",
      },
    ],
    thinking: {
      type: "adaptive",
    },
    max_tokens: 16000,
  });

  console.log(response.choices[0].message.content);
  ```
</CodeGroup>

**Adaptive vs Manual Thinking**

| Mode         | Config                                            | When to use                                                                                        |
| ------------ | ------------------------------------------------- | -------------------------------------------------------------------------------------------------- |
| **Adaptive** | `thinking: { type: "adaptive" }`                  | Recommended for Claude 4.6 models. Claude determines thinking depth automatically.                 |
| **Manual**   | `thinking: { type: "enabled", budget_tokens: N }` | When you need precise control over thinking token spend. Supported on all thinking-capable models. |
| **Disabled** | Omit `thinking` parameter                         | When you don't need extended thinking and want the lowest latency.                                 |

<Note>
  **Supported Models:** Adaptive thinking is available on **Claude Opus 4.6** and **Claude Sonnet 4.6** only. Older models (Opus 4.5, Sonnet 4.5, etc.) require `type: "enabled"` with `budget_tokens`.
</Note>
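
The mode table above can be folded into one small helper. The sketch below is illustrative only: `thinking_config` and `ADAPTIVE_MODELS` are hypothetical names, and the model list simply mirrors the note above. An explicit budget selects manual mode, a 4.6 model without a budget gets adaptive mode, and anything else omits `thinking`.

```python
from typing import Optional

# Models this page lists as supporting adaptive thinking.
ADAPTIVE_MODELS = ("claude-opus-4-6", "claude-sonnet-4-6")


def thinking_config(model: str, budget_tokens: Optional[int] = None) -> Optional[dict]:
    """Build the `thinking` request field, or None to omit it (disabled)."""
    model_name = model.split("/")[-1]  # drop provider prefixes such as "anthropic/"
    if budget_tokens is not None:
        return {"type": "enabled", "budget_tokens": budget_tokens}  # manual mode
    if model_name in ADAPTIVE_MODELS:
        return {"type": "adaptive"}
    return None  # no budget given and adaptive is unavailable: omit `thinking`


print(thinking_config("anthropic/claude-opus-4-6"))
# {'type': 'adaptive'}
print(thinking_config("anthropic/claude-opus-4-6", budget_tokens=8000))
# {'type': 'enabled', 'budget_tokens': 8000}
```

With the Python OpenAI SDK, the returned dictionary would go under `extra_body={'thinking': ...}`, as in the examples above.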

#### Vision Capabilities

All Claude 3 and later models accept image input. Pass an image either by URL or as a base64-encoded data URL:

<Accordion title="Image from URL">
  Use images from URLs for remote files:

  ```typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const response = await openai.chat.completions.create({
    model: "anthropic/claude-sonnet-4-6",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "What's in this image?" },
          {
            type: "image_url",
            image_url: { url: "https://example.com/image.jpg" }
          },
        ],
      },
    ],
    max_tokens: 1024,
  });
  ```
</Accordion>

<Accordion title="Image from Base64">
  Embed images directly as base64-encoded strings:

  ```typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const imageBase64 = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==";

  const response = await openai.chat.completions.create({
    model: "anthropic/claude-sonnet-4-6",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "What's in this image?" },
          {
            type: "image_url",
            // The media type must match the encoded image; the sample above is a PNG.
            image_url: { url: `data:image/png;base64,${imageBase64}` }
          },
        ],
      },
    ],
    max_tokens: 1024,
  });
  ```
</Accordion>
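
When images come from files or uploads, the media type in the data URL has to match the actual bytes. The sketch below shows one way to derive it from the file signature. `to_data_url` and `SIGNATURES` are hypothetical names, and only a few common formats are covered.

```python
import base64

# Leading bytes ("magic numbers") that identify a few common image formats.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def to_data_url(image_bytes: bytes) -> str:
    """Encode raw image bytes as a data URL with a matching media type."""
    for signature, media_type in SIGNATURES.items():
        if image_bytes.startswith(signature):
            encoded = base64.b64encode(image_bytes).decode("ascii")
            return f"data:{media_type};base64,{encoded}"
    raise ValueError("unrecognized image format")


# The 1x1 sample image used in the example above.
sample = base64.b64decode(
    "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
)
print(to_data_url(sample)[:22])  # data:image/png;base64,
```

The resulting string can be passed as `image_url.url` in the request.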

#### PDF Input

Claude Opus 4.6 supports direct PDF analysis:

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const response = await openai.chat.completions.create({
    model: "anthropic/claude-opus-4-6",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Summarize this document" },
          {
            type: "document",
            document: {
              type: "pdf",
              url: "https://example.com/document.pdf"
            }
          },
        ],
      },
    ],
    max_tokens: 2048,
  });
  ```
</CodeGroup>

<Card title="Multimodal" icon="image" href="/docs/proxy/multimodal" horizontal>
  Full reference for image input, PDF input, image generation, and audio through the **AI Router**.
</Card>

#### Tool Use (Function Calling)

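Claude can call functions that you define. Declare them under `tools`; when the model decides to use one, the response carries the function name and JSON-encoded arguments for your code to execute.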

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const response = await openai.chat.completions.create({
    model: "anthropic/claude-sonnet-4-6",
    messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
    max_tokens: 1024,
    tools: [
      {
        type: "function",
        function: {
          name: "get_weather",
          description: "Get current weather for a location",
          parameters: {
            type: "object",
            properties: {
              location: { type: "string" },
            },
            required: ["location"],
          },
        },
      },
    ],
  });
  ```
</CodeGroup>

<Card title="Tool Calling" icon="wrench" href="/docs/proxy/tool-calling" horizontal>
  Full reference for function tools, `tool_choice`, and streaming with tool calls through the **AI Router**.
</Card>
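
Once the model requests a tool, your code runs it and sends the result back in a `tool` message. The sketch below shows that step only. It assumes the router returns tool calls in the OpenAI chat completions shape; the `get_weather` implementation and the sample assistant message are made up for illustration.

```python
import json


def get_weather(location: str) -> str:
    """Hypothetical local implementation of the `get_weather` tool."""
    return f"Sunny in {location}"  # placeholder result, not real weather data


TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list[dict]:
    """Execute each requested tool and return `tool` messages for the next turn."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        arguments = json.loads(function["arguments"])  # arguments arrive as a JSON string
        output = TOOLS[function["name"]](**arguments)
        results.append({"role": "tool", "tool_call_id": call["id"], "content": output})
    return results


# Example assistant message in the OpenAI-compatible shape (illustrative values).
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"location": "Tokyo"}'},
        }
    ],
}

print(run_tool_calls(assistant_message))
# [{'role': 'tool', 'tool_call_id': 'call_1', 'content': 'Sunny in Tokyo'}]
```

Append the assistant message and the returned `tool` messages to the conversation, then call the model again to get its final answer.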

#### Multi-provider strategy

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  // Use Orq's fallback system for reliability
  const response = await openai.chat.completions.create({
    model: "anthropic/claude-sonnet-4-6",
    messages: [{ role: "user", content: "..." }],
    max_tokens: 1024,
    orq: {
      fallbacks: [
        { model: "aws/anthropic/claude-sonnet-4-6" },
        { model: "anthropic/claude-opus-4-6" },
      ],
    },
  });
  ```
</CodeGroup>
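
For the Python OpenAI SDK, the same `orq.fallbacks` field can be sent through `extra_body`, which merges extra properties into the request body. The sketch below only builds the keyword arguments; `with_fallbacks` is a hypothetical helper, not part of Orq's SDK.

```python
def with_fallbacks(primary: str, fallbacks: list[str]) -> dict:
    """Build keyword arguments carrying the `orq.fallbacks` field shown above."""
    return {
        "model": primary,
        "extra_body": {"orq": {"fallbacks": [{"model": name} for name in fallbacks]}},
    }


kwargs = with_fallbacks(
    "anthropic/claude-sonnet-4-6",
    ["aws/anthropic/claude-sonnet-4-6", "anthropic/claude-opus-4-6"],
)
print(kwargs["extra_body"])

# Usage with a client configured as in the earlier Python examples:
# response = openai.chat.completions.create(messages=messages, max_tokens=1024, **kwargs)
```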

### Configuration

#### Model Parameters

| Parameter        | Type      | Description                           | Default |
| ---------------- | --------- | ------------------------------------- | ------- |
| `max_tokens`     | number    | Maximum tokens to generate (required) | -       |
| `temperature`    | number    | Randomness (0-1)                      | 1       |
| `top_p`          | number    | Nucleus sampling (0-1)                | -       |
| `top_k`          | number    | Top-K sampling                        | -       |
| `stop_sequences` | string\[] | Custom stop sequences                 | -       |

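**Note**: `max_tokens` is required for Anthropic models. Typical values are 1024 for short responses and 4096 or more for long-form content. With extended thinking enabled, `max_tokens` must also be greater than `thinking.budget_tokens`.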

<Warning>
  Do not set both `temperature` and `top_p` on newer Anthropic models. A request that includes both fails with an API error, so choose one or the other.
</Warning>
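
A small guard can catch this conflict before the request leaves your application. This is a minimal sketch: `sampling_params` is a hypothetical helper, and the 0 to 1 ranges are taken from the parameter table above.

```python
from typing import Optional


def sampling_params(temperature: Optional[float] = None, top_p: Optional[float] = None) -> dict:
    """Return request fields for sampling, allowing at most one of the two."""
    if temperature is not None and top_p is not None:
        raise ValueError("set temperature or top_p, not both")
    for name, value in (("temperature", temperature), ("top_p", top_p)):
        if value is not None and not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    params = {"temperature": temperature, "top_p": top_p}
    return {key: value for key, value in params.items() if value is not None}


print(sampling_params(temperature=0.2))  # {'temperature': 0.2}
```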

#### Token Management

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
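  // Set appropriate max_tokens based on task
  const getMaxTokens = (taskType: string): number => {
    // Typed as a Record so any string key can index it under strict mode
    const limits: Record<string, number> = {
      chat: 1024,
      summary: 500,
      generation: 4096,
      analysis: 2048,
    };
    return limits[taskType] ?? 1024; // fall back to 1024 for unknown task types
  };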
  ```
</CodeGroup>

### Troubleshooting

| Issue                | Cause                                             | Solution                                                                                                                  |
| -------------------- | ------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------- |
| Missing `max_tokens` | Anthropic models require `max_tokens` parameter   | Add `max_tokens: 1024` (or appropriate value) to your request                                                             |
| High costs           | Token usage accumulates quickly on large requests | Enable prompt caching for repeated context, use smaller models (Haiku) for simple tasks, monitor and optimize token usage |
| Rate limits          | Anthropic has tiered rate limits based on usage   | Use Orq's automatic retries and fallbacks, or consider AWS/Google providers for higher limits                             |

#### Limitations

* **max\_tokens required**: Unlike with OpenAI models, you must specify a maximum output length on every request
* **Rate limits**: Vary by tier and provider
* **Context window**: Depends on the model, for example 200K tokens on `claude-opus-4-6` and 1M on `claude-opus-4-7` (may vary by provider)
* **System prompts**: Anthropic takes the system prompt as a separate request field rather than as a message; Orq converts `system` messages automatically
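
To illustrate the system prompt difference: Anthropic's Messages API takes the system prompt as a top-level `system` field, while OpenAI-style requests carry it as a message. The sketch below shows roughly what such a conversion involves. It is not Orq's implementation, `split_system` is a hypothetical name, and it assumes plain string content.

```python
def split_system(messages: list[dict]) -> tuple[str, list[dict]]:
    """Split OpenAI-style messages into a system string and the remaining turns."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    conversation = [m for m in messages if m["role"] != "system"]
    return "\n\n".join(system_parts), conversation


system, conversation = split_system([
    {"role": "system", "content": "You are a system architect."},
    {"role": "user", "content": "Design a notification system"},
])
print(system)        # You are a system architect.
print(conversation)  # [{'role': 'user', 'content': 'Design a notification system'}]
```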

### Reference

* [Anthropic Documentation](https://docs.anthropic.com/)
* [Model Pricing](https://www.anthropic.com/pricing)
* [API Reference](https://docs.anthropic.com/en/api/messages)
* [Prompt Engineering Guide](https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview)
