This page describes features extending the AI Gateway, which provides a unified API for accessing multiple AI providers. To learn more, see AI Gateway.
Quick Start
Set a maximum request duration to prevent hanging requests.
Configuration
| Parameter | Type | Required | Description | 
|---|---|---|---|
| call_timeout | number | Yes | Maximum execution time in milliseconds | 
The timeout covers:
- Request processing time
- Model generation time
- Network transfer time
- All fallback attempts (each attempt gets the same timeout)
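As a rough sketch, setting the parameter might look like the following; the model, messages, and surrounding payload shape are illustrative assumptions, while `call_timeout` is the parameter described above.

```ts
// A minimal request-body sketch. Everything except call_timeout is an
// illustrative assumption about the payload shape.
const requestBody = {
  model: "gpt-4",
  messages: [{ role: "user", content: "Hello!" }],
  // Limit each attempt (primary and any fallbacks) to 15 seconds.
  call_timeout: 15_000,
};
```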
Recommended Values
| Use Case | Timeout (ms) | Reason | 
|---|---|---|
| Chat applications | 15000 (15s) | User expectation for responses |
| Real-time features | 5000 (5s) | Immediate feedback required |
| Batch processing | 60000 (60s) | Complex analysis tasks |
| Streaming responses | 30000 (30s) | Longer generation time |
| Development/testing | 10000 (10s) | Fast iteration cycles |
Code Examples
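The sketch below assumes a JSON-over-HTTP gateway endpoint; the URL, headers, and response handling are placeholders, and the timeout presets come from the Recommended Values table above.

```ts
// Timeout presets from the Recommended Values table above (milliseconds).
const TIMEOUTS = {
  chat: 15_000,
  realtime: 5_000,
  batch: 60_000,
  streaming: 30_000,
  development: 10_000,
} as const;

type UseCase = keyof typeof TIMEOUTS;

// Hypothetical endpoint and payload shape; call_timeout is the documented parameter.
async function callGateway(useCase: UseCase, prompt: string): Promise<unknown> {
  const response = await fetch("https://gateway.example.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.GATEWAY_API_KEY ?? ""}`,
    },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [{ role: "user", content: prompt }],
      call_timeout: TIMEOUTS[useCase],
    }),
  });
  if (!response.ok) {
    throw new Error(`Gateway request failed with status ${response.status}`);
  }
  return response.json();
}

// Example: a chat request gets the 15s preset; a batch job would get 60s.
callGateway("chat", "Explain call_timeout in one sentence.").then(console.log);
```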
Error Handling
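How an exceeded `call_timeout` surfaces to the caller depends on the deployment; the sketch below assumes the gateway responds with a timeout-style HTTP status (408 and 504 are used here as placeholders) and shows one way to degrade gracefully.

```ts
// Handle a request that exceeds call_timeout. The 408/504 status check and
// the response shape are assumptions; adapt them to what your gateway returns.
async function askWithTimeoutHandling(prompt: string): Promise<string> {
  try {
    const response = await fetch("https://gateway.example.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.GATEWAY_API_KEY ?? ""}`,
      },
      body: JSON.stringify({
        model: "gpt-4",
        messages: [{ role: "user", content: prompt }],
        call_timeout: 15_000,
      }),
    });

    if (response.status === 408 || response.status === 504) {
      // The call timed out after all attempts: show a friendly message.
      return "The model took too long to respond. Please try again.";
    }
    if (!response.ok) {
      throw new Error(`Gateway error: ${response.status}`);
    }

    const data = await response.json();
    return data.choices?.[0]?.message?.content ?? "";
  } catch (err) {
    // Network failures and other unexpected errors.
    console.error("Request failed:", err);
    return "Something went wrong. Please try again.";
  }
}
```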
Best Practices
Timeout selection:
- Set based on user experience requirements
- Consider model complexity and prompt length
- Factor in network latency (add a 2-5s buffer)
- Test with realistic prompts and data (see the sketch below)
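One way to apply this guidance is to derive the timeout from measured latency plus the suggested network buffer. The duration samples below are placeholders you would replace with your own measurements.

```ts
// Derive call_timeout from observed latency: the 95th-percentile response
// time plus a 2-5s network buffer, as recommended above.
function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const index = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[index];
}

function recommendedTimeoutMs(durationsMs: number[], bufferMs = 3_000): number {
  return percentile(durationsMs, 95) + bufferMs;
}

// Placeholder measurements: recent chat completions took 4-12 seconds.
const recentDurationsMs = [4200, 5100, 6300, 7800, 9000, 11500, 12000];
const callTimeout = recommendedTimeoutMs(recentDurationsMs); // 12000 + 3000 = 15000
```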
Fallback Integration
Timeouts work seamlessly with fallbacks: each attempt (primary and every fallback) gets the same timeout, so the worst-case total duration is timeout × (1 + fallback_count).
- Primary + 2 fallbacks with 15s timeout = up to 45s total
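Budget for this worst case when fallbacks are enabled; the helper below simply restates the multiplication above in code.

```ts
// Worst-case wall-clock time: the primary attempt plus each fallback
// attempt is allowed the full call_timeout.
function worstCaseDurationMs(callTimeoutMs: number, fallbackCount: number): number {
  return callTimeoutMs * (1 + fallbackCount);
}

// Primary + 2 fallbacks with a 15s timeout = up to 45s total.
console.log(worstCaseDurationMs(15_000, 2)); // 45000
```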
Troubleshooting
**Frequent timeouts**
- Increase timeout value
- Use faster models (gpt-3.5-turbo vs gpt-4)
- Reduce prompt complexity/length
- Check provider status for slowdowns
**Slow responses**
- Set timeout based on user expectations
- Show loading states for longer operations
- Implement progressive enhancement
- Consider async processing for long tasks
Advanced Patterns
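Dynamic timeout adjustment: choose `call_timeout` per request instead of hard-coding one value, for example based on the model and prompt length. The base values and scaling factors below are illustrative assumptions, not gateway defaults.

```ts
// Pick call_timeout per request. All constants here are illustrative.
function dynamicTimeoutMs(opts: { model: string; promptChars: number; streaming?: boolean }): number {
  // Slower models get a larger base timeout.
  let timeout = opts.model.startsWith("gpt-4") ? 20_000 : 10_000;

  // Longer prompts tend to produce longer generations: +1s per 1,000 characters.
  timeout += Math.ceil(opts.promptChars / 1_000) * 1_000;

  // Streaming responses may need longer timeouts (see Limitations below).
  if (opts.streaming) timeout = Math.round(timeout * 1.5);

  // Clamp to a sensible range.
  return Math.min(Math.max(timeout, 5_000), 60_000);
}

const callTimeout = dynamicTimeoutMs({ model: "gpt-4", promptChars: 3_500, streaming: true });
// (20000 + 4000) * 1.5 = 36000 ms
```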
Limitations
- Fixed timeout: Same timeout applies to all requests
- No granular control: Cannot set different timeouts for different operations
- Fallback multiplication: Each fallback gets the same timeout duration
- Provider variations: Different providers have different baseline response times
- Streaming considerations: Streaming responses may need longer timeouts
Monitoring
Key metrics to track:
- Timeout rate: % of requests that time out
- Average response time: Baseline performance
- 95th percentile response time: Basis for setting optimal timeouts
- Timeout impact: User experience degradation
- Model performance: Response times by model
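A sketch of deriving these metrics from request logs; the log record shape is an assumption, and the output feeds back into timeout selection above.

```ts
// Compute the monitoring metrics above from request logs. The RequestLog
// shape is an assumed example; adapt it to whatever telemetry you collect.
interface RequestLog {
  model: string;
  durationMs: number;
  timedOut: boolean;
}

function summarize(logs: RequestLog[]) {
  if (logs.length === 0) throw new Error("No request logs to summarize");
  const durations = logs.map((l) => l.durationMs).sort((a, b) => a - b);
  const p95Index = Math.min(durations.length - 1, Math.ceil(durations.length * 0.95) - 1);
  return {
    // % of requests that time out.
    timeoutRate: logs.filter((l) => l.timedOut).length / logs.length,
    // Baseline performance.
    averageResponseMs: durations.reduce((sum, d) => sum + d, 0) / durations.length,
    // Candidate basis for call_timeout (plus a network buffer).
    p95ResponseMs: durations[p95Index],
  };
}
```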