> ## Documentation Index
> Fetch the complete documentation index at: https://docs.orq.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Request timeouts

> Set maximum LLM request duration to prevent hanging calls. Configure timeouts per request for chat, batch processing, and streaming with automatic fallback.

## Quick Start

Set maximum request duration to prevent hanging requests.

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const response = await openai.chat.completions.create({
    model: "openai/gpt-4o-mini",
    messages: [{ role: "user", content: "Summarize AI trends for 2024" }],
    timeout: {
      call_timeout: 30000, // 30 seconds
    },
  });
  ```
</CodeGroup>

## Configuration

| Parameter      | Type   | Required | Description                            |
| -------------- | ------ | -------- | -------------------------------------- |
| `call_timeout` | number | Yes      | Maximum execution time in milliseconds |

**Timeout applies to:**

* Request processing time
* Model generation time
* Network transfer time
* All fallback attempts (each gets same timeout)

## Recommended Values

| Use Case                | Timeout (ms)  | Reason                         |
| ----------------------- | ------------- | ------------------------------ |
| **Chat applications**   | `15000` (15s) | User expectation for responses |
| **Real-time features**  | `5000` (5s)   | Immediate feedback required    |
| **Batch processing**    | `60000` (60s) | Complex analysis tasks         |
| **Streaming responses** | `30000` (30s) | Longer generation time         |
| **Development/testing** | `10000` (10s) | Fast iteration cycles          |

## Code examples

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/chat/completions \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-4o-mini",
      "messages": [
        {
          "role": "user",
          "content": "Summarize the latest trends in artificial intelligence for 2024"
        }
      ],
      "timeout": {
        "call_timeout": 30000
      }
    }'
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  openai = OpenAI(
    api_key=os.environ.get("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router"
  )

  response = openai.chat.completions.create(
      model="openai/gpt-4o",
      messages=[
          {
              "role": "user",
              "content": "Summarize the latest trends in artificial intelligence for 2024"
          }
      ],
      extra_body={
          "timeout": {
              "call_timeout": 30000
          }
      }
  )
  ```

  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const openai = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await openai.chat.completions.create({
    model: "openai/gpt-4o",
    messages: [
      {
        role: "user",
        content: "Summarize the latest trends in artificial intelligence for 2024",
      },
    ],
    timeout: {
      call_timeout: 30000,
    },
  });
  ```
</CodeGroup>

## Error Handling

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  try {
    const response = await openai.chat.completions.create({
      model: "openai/gpt-4o",
      messages: [...],
      timeout: { call_timeout: 15000 }
    });
  } catch (error) {
    if (error.code === 'TIMEOUT') {
      console.log('Request timed out - try increasing timeout or using faster model');
      // Implement fallback behavior
    }
  }
  ```
</CodeGroup>

## Best Practices

**Timeout selection:**

* Set based on user experience requirements
* Consider model complexity and prompt length
* Factor in network latency (add 2-5s buffer)
* Test with realistic prompts and data

**Environment-specific timeouts:**

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const timeouts = {
    development: 10000, // Fast feedback during dev
    staging: 20000, // Realistic testing
    production: 30000, // Conservative for reliability
  };
  ```
</CodeGroup>

**Progressive timeouts:**

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  // Start with short timeout, increase for retries
  const attempts = [
    { timeout: 10000, model: "fast-model" },
    { timeout: 20000, model: "standard-model" },
    { timeout: 30000, model: "comprehensive-model" },
  ];
  ```
</CodeGroup>

## Fallback Integration

Timeouts work seamlessly with fallbacks:

<CodeGroup>
  ```json JSON theme={"theme":{"light":"github-light","dark":"github-dark"}}
  {
    "timeout": { "call_timeout": 15000 },
    "fallbacks": [
      { "model": "openai/gpt-4o" },
      { "model": "openai/gpt-3.5-turbo" }
    ]
  }
  ```
</CodeGroup>

**Total possible time:** `timeout × (1 + fallback_count)`

* Primary + 2 fallbacks with 15s timeout = up to 45s total

## Troubleshooting

**Frequent timeouts**

* Increase timeout value
* Use faster models (gpt-3.5-turbo vs gpt-4)
* Reduce prompt complexity/length
* Check provider status for slowdowns

**User experience issues**

* Set timeout based on user expectations
* Show loading states for longer operations
* Implement progressive enhancement
* Consider async processing for long tasks

**Performance optimization**

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  // Monitor timeout patterns
  const timeoutMetrics = {
    averageResponseTime: 0,
    timeoutRate: 0,
    responseTimesByModel: {},
    optimalTimeout: 0, // 95th percentile + buffer
  };
  ```
</CodeGroup>

## Advanced Patterns

**Dynamic timeout adjustment:**

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const getDynamicTimeout = (promptLength, modelComplexity) => {
    const baseTimeout = 10000;
    const promptFactor = Math.min(promptLength / 1000, 3); // Max 3x for long prompts
    const modelFactor = modelComplexity === "simple" ? 1 : 2;

    return baseTimeout * promptFactor * modelFactor;
  };
  ```
</CodeGroup>

**Timeout with streaming:**

<CodeGroup>
  ```json JSON theme={"theme":{"light":"github-light","dark":"github-dark"}}
  {
    "stream": true,
    "timeout": {
      "call_timeout": 30000
    }
  }
  ```
</CodeGroup>

**Circuit breaker pattern:**

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  class CircuitBreaker {
    constructor(timeout, failureThreshold = 5) {
      this.timeout = timeout;
      this.failureCount = 0;
      this.failureThreshold = failureThreshold;
      this.state = "CLOSED"; // CLOSED, OPEN, HALF_OPEN
    }

    async call(requestFn) {
      if (this.state === "OPEN") {
        throw new Error("Circuit breaker is OPEN");
      }

      try {
        const result = await requestFn();
        this.onSuccess();
        return result;
      } catch (error) {
        this.onFailure();
        throw error;
      }
    }
  }
  ```
</CodeGroup>

## Limitations

* **Fixed timeout**: Same timeout applies to all requests
* **No granular control**: Cannot set different timeouts for different operations
* **Fallback multiplication**: Each fallback gets the same timeout duration
* **Provider variations**: Different providers have different baseline response times
* **Streaming considerations**: Streaming responses may need longer timeouts

## Monitoring

Key metrics to track:

* **Timeout rate**: % of requests that timeout
* **Average response time**: Baseline performance
* **95th percentile**: For setting optimal timeouts
* **Timeout impact**: User experience degradation
* **Model performance**: Response times by model

<CodeGroup>
  ```typescript Typescript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  // Example monitoring
  const monitorTimeouts = (responseTime, wasTimeout) => {
    metrics.totalRequests++;
    if (wasTimeout) {
      metrics.timeouts++;
    } else {
      metrics.responseTimes.push(responseTime);
    }

    // Calculate optimal timeout (95th percentile + 5s buffer)
    const p95 = calculatePercentile(metrics.responseTimes, 95);
    metrics.recommendedTimeout = p95 + 5000;
  };
  ```
</CodeGroup>
