# Request timeouts

> Set maximum LLM request duration to prevent hanging calls. Configure timeouts per request for chat, batch processing, and streaming with automatic fallback.

**Use Cases**

* Preventing slow models from blocking user-facing requests indefinitely.
* Setting different limits for interactive (short) vs. batch (long) workloads.
* Triggering fallback logic when a provider exceeds an acceptable wait time.
* Enforcing response-time SLAs on latency-sensitive features.

***

## Quick Start

Pass a `timeout` object with `call_timeout` (in milliseconds) to cap how long a request may run. With the Python SDK, the field goes in `extra_body`.

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/responses \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-4o-mini",
      "input": "Summarize AI trends for 2024",
      "timeout": {"call_timeout": 30000}
    }'
  ```

  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.responses.create({
    model: "openai/gpt-4o-mini",
    input: "Summarize AI trends for 2024",
    timeout: {
      call_timeout: 30000,
    },
  });

  console.log(response.output_text);
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  response = client.responses.create(
      model="openai/gpt-4o-mini",
      input="Summarize AI trends for 2024",
      extra_body={"timeout": {"call_timeout": 30000}},
  )

  print(response.output_text)
  ```

  ```typescript TypeScript (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.chat.completions.create({
    model: "openai/gpt-4o-mini",
    messages: [{ role: "user", content: "Summarize AI trends for 2024" }],
    timeout: {
      call_timeout: 30000,
    },
  });
  ```
</CodeGroup>

## Configuration

| Parameter      | Type   | Required | Description                            |
| -------------- | ------ | -------- | -------------------------------------- |
| `call_timeout` | number | Yes      | Maximum execution time in milliseconds |

**Timeout applies to:**

* Request processing time.
* Model generation time.
* Network transfer time.
* All fallback attempts (each attempt gets the same timeout).

## Recommended Values

| Use Case                | Timeout (ms)  | Reason                         |
| ----------------------- | ------------- | ------------------------------ |
| **Chat applications**   | `15000` (15s) | User expectation for responses |
| **Real-time features**  | `5000` (5s)   | Immediate feedback required    |
| **Batch processing**    | `60000` (60s) | Complex analysis tasks         |
| **Streaming responses** | `30000` (30s) | Longer generation time         |
| **Development/testing** | `10000` (10s) | Fast iteration cycles          |
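
The table can be encoded as a small lookup so that each call site picks its timeout by workload instead of hard-coding a number. This is a minimal sketch: the `UseCase` labels and the `timeoutFor` helper are hypothetical names chosen for illustration, and only the `timeout.call_timeout` field comes from the configuration above.

```typescript
// Hypothetical workload labels mirroring the rows of the table above.
type UseCase = "chat" | "realtime" | "batch" | "streaming" | "development";

// Recommended call_timeout values in milliseconds, copied from the table.
const RECOMMENDED_TIMEOUT_MS: Record<UseCase, number> = {
  chat: 15000,
  realtime: 5000,
  batch: 60000,
  streaming: 30000,
  development: 10000,
};

// Builds the `timeout` fragment to merge into a request body.
function timeoutFor(useCase: UseCase): { timeout: { call_timeout: number } } {
  return { timeout: { call_timeout: RECOMMENDED_TIMEOUT_MS[useCase] } };
}

// Example request body for an interactive chat feature.
const body = {
  model: "openai/gpt-4o-mini",
  input: "Summarize AI trends for 2024",
  ...timeoutFor("chat"),
};

console.log(JSON.stringify(body.timeout)); // {"call_timeout":15000}
```

Treat the values as starting points and adjust them against measured response times.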

## Code Examples

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/responses \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-4o-mini",
      "input": "Summarize the latest trends in artificial intelligence for 2024",
      "timeout": {"call_timeout": 30000}
    }'
  ```

  ```bash cURL (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/chat/completions \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-4o-mini",
      "messages": [
        {
          "role": "user",
          "content": "Summarize the latest trends in artificial intelligence for 2024"
        }
      ],
      "timeout": {"call_timeout": 30000}
    }'
  ```

  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.responses.create({
    model: "openai/gpt-4o-mini",
    input: "Summarize the latest trends in artificial intelligence for 2024",
    timeout: {
      call_timeout: 30000,
    },
  });

  console.log(response.output_text);
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  response = client.responses.create(
      model="openai/gpt-4o-mini",
      input="Summarize the latest trends in artificial intelligence for 2024",
      extra_body={"timeout": {"call_timeout": 30000}},
  )

  print(response.output_text)
  ```

  ```typescript TypeScript (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.chat.completions.create({
    model: "openai/gpt-4o-mini",
    messages: [
      {
        role: "user",
        content: "Summarize the latest trends in artificial intelligence for 2024",
      },
    ],
    timeout: {
      call_timeout: 30000,
    },
  });
  ```

  ```python Python (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  response = client.chat.completions.create(
      model="openai/gpt-4o-mini",
      messages=[
          {
              "role": "user",
              "content": "Summarize the latest trends in artificial intelligence for 2024",
          }
      ],
      extra_body={"timeout": {"call_timeout": 30000}},
  )
  ```
</CodeGroup>

## Error Handling

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  try {
    const response = await client.responses.create({
      model: "openai/gpt-4o",
      input: "Explain quantum computing",
      timeout: { call_timeout: 15000 },
    });
    console.log(response.output_text);
  } catch (error) {
    if (error instanceof OpenAI.APIConnectionTimeoutError) {
      console.log('Request timed out - try increasing timeout or using faster model');
    }
  }
  ```

  ```typescript TypeScript (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  try {
    const response = await client.chat.completions.create({
      model: "openai/gpt-4o",
      messages: [{ role: "user", content: "Explain quantum computing" }],
      timeout: { call_timeout: 15000 }
    });
  } catch (error) {
    if (error instanceof OpenAI.APIConnectionTimeoutError) {
      console.log('Request timed out - try increasing timeout or using faster model');
      // Implement fallback behavior
    }
  }
  ```
</CodeGroup>

## Best Practices

**Timeout selection:**

* Set based on user experience requirements.
* Consider model complexity and prompt length.
* Factor in network latency (add 2-5s buffer).
* Test with realistic prompts and data.

**Environment-specific timeouts:**

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const timeouts = {
    development: 10000, // Fast feedback during dev
    staging: 20000, // Realistic testing
    production: 30000, // Conservative for reliability
  };
  ```
</CodeGroup>
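
One way to apply such a map is to resolve the value once at startup from the environment name. The sketch below assumes a hypothetical `resolveTimeout` helper and assumes that falling back to the production value is the safe default for unknown environments; neither is part of the API.

```typescript
// Per-environment call_timeout values in milliseconds (same values as above).
const TIMEOUTS_BY_ENV = new Map<string, number>([
  ["development", 10000],
  ["staging", 20000],
  ["production", 30000],
]);

// Assumption: unknown or missing environments use the production value.
const DEFAULT_TIMEOUT_MS = 30000;

// Hypothetical helper: returns the call_timeout for an environment name.
function resolveTimeout(env: string | undefined): number {
  return TIMEOUTS_BY_ENV.get(env ?? "") ?? DEFAULT_TIMEOUT_MS;
}

// Build the request fragment for the current environment.
const timeoutConfig = { timeout: { call_timeout: resolveTimeout("staging") } };
console.log(timeoutConfig.timeout.call_timeout); // 20000
```

In a Node.js service the argument would typically be `process.env.NODE_ENV`.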

**Progressive timeouts:**

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  // Start with short timeout, increase for retries
  const attempts = [
    { timeout: 10000, model: "fast-model" },
    { timeout: 20000, model: "standard-model" },
    { timeout: 30000, model: "comprehensive-model" },
  ];
  ```
</CodeGroup>
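
A list like this can drive a simple client-side retry loop. The sketch below is one possible shape, not a prescribed implementation: `runProgressive` and the `send` callback are hypothetical, and `send` stands in for whatever function performs the actual request with the given model and `call_timeout`. For brevity it moves on after any error; a real implementation would likely retry only on timeouts.

```typescript
interface Attempt {
  timeout: number; // call_timeout in milliseconds for this attempt
  model: string; // placeholder model name, as in the list above
}

// Hypothetical sender: performs one request and rejects on failure or timeout.
type Send<T> = (attempt: Attempt) => Promise<T>;

// Tries each attempt in order and returns the first successful result.
// If every attempt fails, the last error is rethrown.
async function runProgressive<T>(attempts: Attempt[], send: Send<T>): Promise<T> {
  let lastError: unknown = new Error("No attempts configured");
  for (const attempt of attempts) {
    try {
      return await send(attempt);
    } catch (error) {
      lastError = error; // Remember the failure and move to the next attempt
    }
  }
  throw lastError;
}
```

Because the sender is injected, the loop can be tested with a stub and reused with either the Responses or the Chat Completions endpoint.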

## Fallback Integration

When a call exceeds `call_timeout`, the request can move on to the next model in `fallbacks`, and each attempt gets the full timeout:

<CodeGroup>
  ```json JSON theme={"theme":{"light":"github-light","dark":"github-dark"}}
  {
    "timeout": { "call_timeout": 15000 },
    "fallbacks": [
      { "model": "openai/gpt-4o" },
      { "model": "openai/gpt-5-mini" }
    ]
  }
  ```
</CodeGroup>

**Total possible time:** `timeout × (1 + fallback_count)`

* Primary + 2 fallbacks with 15s timeout = up to 45s total.
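
The formula is easy to turn into a check against a latency budget. In this sketch both function names are hypothetical; the second one simply inverts the formula to find the largest `call_timeout` that keeps the worst case within a total budget.

```typescript
// Worst-case wall-clock time for one request, in milliseconds:
// the primary call and every fallback may each run for the full call_timeout.
function worstCaseTotalMs(callTimeoutMs: number, fallbackCount: number): number {
  return callTimeoutMs * (1 + fallbackCount);
}

// Inverse: the largest call_timeout that keeps the worst case within a budget.
function callTimeoutForBudget(totalBudgetMs: number, fallbackCount: number): number {
  return Math.floor(totalBudgetMs / (1 + fallbackCount));
}

console.log(worstCaseTotalMs(15000, 2)); // 45000
console.log(callTimeoutForBudget(45000, 2)); // 15000
```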

## Troubleshooting

**Frequent timeouts**

* Increase timeout value.
* Use faster models (gpt-5-mini vs gpt-5).
* Reduce prompt complexity/length.
* Check provider status for slowdowns.

**User experience issues**

* Set timeout based on user expectations.
* Show loading states for longer operations.
* Implement progressive enhancement.
* Consider async processing for long tasks.

**Performance optimization**

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  // Monitor timeout patterns
  const timeoutMetrics = {
    averageResponseTime: 0,
    timeoutRate: 0,
    responseTimesByModel: {},
    optimalTimeout: 0, // 95th percentile + buffer
  };
  ```
</CodeGroup>

## Advanced Patterns

**Dynamic timeout adjustment:**

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
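  // Scales a 10s base timeout by prompt length and model complexity.
  const getDynamicTimeout = (
    promptLength: number,
    modelComplexity: string,
  ): number => {
    const baseTimeout = 10000;
    // Clamp to 1x-3x so short prompts never drop below the base timeout
    const promptFactor = Math.min(Math.max(promptLength / 1000, 1), 3);
    const modelFactor = modelComplexity === "simple" ? 1 : 2;

    return baseTimeout * promptFactor * modelFactor;
  };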
  ```
</CodeGroup>

**Timeout with streaming:**

<CodeGroup>
  ```json JSON theme={"theme":{"light":"github-light","dark":"github-dark"}}
  {
    "stream": true,
    "timeout": {
      "call_timeout": 30000
    }
  }
  ```
</CodeGroup>

**Circuit breaker pattern:**

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
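  class CircuitBreaker {
    // Assumption: `timeout` is the cool-down in ms before a trial request is allowed
    timeout: number;
    failureCount: number;
    failureThreshold: number;
    state: "CLOSED" | "OPEN" | "HALF_OPEN";
    openedAt = 0; // Timestamp (ms) of the last transition to OPEN

    constructor(timeout: number, failureThreshold = 5) {
      this.timeout = timeout;
      this.failureCount = 0;
      this.failureThreshold = failureThreshold;
      this.state = "CLOSED";
    }

    async call<T>(requestFn: () => Promise<T>): Promise<T> {
      if (this.state === "OPEN") {
        if (Date.now() - this.openedAt < this.timeout) {
          throw new Error("Circuit breaker is OPEN");
        }
        this.state = "HALF_OPEN"; // Cool-down elapsed: let one trial request through
      }

      try {
        const result = await requestFn();
        this.onSuccess();
        return result;
      } catch (error) {
        this.onFailure();
        throw error;
      }
    }

    // A success closes the circuit and clears the failure count
    private onSuccess() {
      this.failureCount = 0;
      this.state = "CLOSED";
    }

    // Open the circuit after a failed trial or once the threshold is reached
    private onFailure() {
      this.failureCount++;
      if (this.state === "HALF_OPEN" || this.failureCount >= this.failureThreshold) {
        this.state = "OPEN";
        this.openedAt = Date.now();
      }
    }
  }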
  ```
</CodeGroup>

## Limitations

* **Fixed timeout**: Same timeout applies to all requests.
* **No granular control**: Cannot set different timeouts for different operations.
* **Fallback multiplication**: Each fallback gets the same timeout duration.
* **Provider variations**: Different providers have different baseline response times.
* **Streaming considerations**: Streaming responses may need longer timeouts.

## Monitoring

Key metrics to track:

* **Timeout rate**: % of requests that time out.
* **Average response time**: Baseline performance.
* **95th percentile**: For setting optimal timeouts.
* **Timeout impact**: User experience degradation.
* **Model performance**: Response times by model.

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  // Example monitoring
  const metrics = {
    totalRequests: 0,
    timeouts: 0,
    responseTimes: [] as number[],
    recommendedTimeout: 0,
  };

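  const calculatePercentile = (arr: number[], p: number): number => {
    const sorted = [...arr].sort((a, b) => a - b);
    // Clamp the index so p = 100 returns the largest sample instead of undefined
    const index = Math.min(Math.floor((p / 100) * sorted.length), sorted.length - 1);
    return sorted[index] ?? 0; // Empty input yields 0
  };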

  const monitorTimeouts = (responseTime: number, wasTimeout: boolean) => {
    metrics.totalRequests++;
    if (wasTimeout) {
      metrics.timeouts++;
    } else {
      metrics.responseTimes.push(responseTime);
    }

    // Calculate optimal timeout (95th percentile + 5s buffer)
    const p95 = calculatePercentile(metrics.responseTimes, 95);
    metrics.recommendedTimeout = p95 + 5000;
  };
  ```
</CodeGroup>
