Timeouts
This page describes features extending the AI Proxy, which provides a unified API for accessing multiple AI providers. To learn more, see AI Proxy.
Quick Start
Set a maximum request duration to prevent hanging requests.
const response = await openai.chat.completions.create({
model: "openai/gpt-4o-mini",
messages: [{ role: "user", content: "Summarize AI trends for 2024" }],
orq: {
timeout: {
call_timeout: 30000, // 30 seconds
},
},
});
Configuration
Parameter | Type | Required | Description |
---|---|---|---|
call_timeout | number | Yes | Maximum execution time in milliseconds |
Timeout applies to:
- Request processing time
- Model generation time
- Network transfer time
- All fallback attempts (each gets same timeout)
Recommended Values
Use Case | Timeout (ms) | Reason |
---|---|---|
Chat applications | 15000 (15s) | User expectation for responses |
Real-time features | 5000 (5s) | Immediate feedback required |
Batch processing | 60000 (60s) | Complex analysis tasks |
Streaming responses | 30000 (30s) | Longer generation time |
Development/testing | 10000 (10s) | Fast iteration cycles |
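The recommendations above can be encoded as a simple lookup at startup. This is a sketch; the use-case keys and the conservative default are our own naming, not part of the proxy API:

```javascript
// Recommended call_timeout values from the table above, in milliseconds.
// Keys are hypothetical labels for each use case.
const RECOMMENDED_TIMEOUTS = {
  chat: 15000,
  realtime: 5000,
  batch: 60000,
  streaming: 30000,
  development: 10000,
};

// Fall back to a conservative default for unknown use cases.
function timeoutFor(useCase) {
  return RECOMMENDED_TIMEOUTS[useCase] ?? 30000;
}
```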
Code Examples
curl -X POST https://api.orq.ai/v2/proxy/chat/completions \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"messages": [
{
"role": "user",
"content": "Summarize the latest trends in artificial intelligence for 2024"
}
],
"orq": {
"timeout": {
"call_timeout": 30000
}
}
}'
from openai import OpenAI
import os
openai = OpenAI(
api_key=os.environ.get("ORQ_API_KEY"),
base_url="https://api.orq.ai/v2/proxy"
)
response = openai.chat.completions.create(
model="openai/gpt-4o",
messages=[
{
"role": "user",
"content": "Summarize the latest trends in artificial intelligence for 2024"
}
],
extra_body={
"orq": {
"timeout": {
"call_timeout": 30000
}
}
}
)
import OpenAI from "openai";
const openai = new OpenAI({
apiKey: process.env.ORQ_API_KEY,
baseURL: "https://api.orq.ai/v2/proxy",
});
const response = await openai.chat.completions.create({
model: "openai/gpt-4o",
messages: [
{
role: "user",
content: "Summarize the latest trends in artificial intelligence for 2024",
},
],
orq: {
timeout: {
call_timeout: 30000,
},
},
});
Error Handling
try {
const response = await openai.chat.completions.create({
model: "openai/gpt-4o",
messages: [...],
orq: {
timeout: {call_timeout: 15000}
}
});
} catch (error) {
if (error.code === 'TIMEOUT') {
console.log('Request timed out - try increasing timeout or using faster model');
// Implement fallback behavior
}
}
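The fallback behavior hinted at above can be sketched as a single escalated retry. The `isTimeoutError` predicate here is hypothetical; adapt it to however your SDK actually reports timeouts:

```javascript
// Hypothetical predicate; adjust to your SDK's actual timeout error shape.
const isTimeoutError = (error) => Boolean(error) && error.code === "TIMEOUT";

// Retry once with a doubled timeout when the first attempt times out.
// `requestFn` stands in for your proxy call and receives the timeout in ms.
async function completeWithRetry(requestFn, initialTimeout) {
  try {
    return await requestFn(initialTimeout);
  } catch (error) {
    if (!isTimeoutError(error)) throw error; // only retry timeouts
    return requestFn(initialTimeout * 2); // one escalated retry
  }
}
```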
Best Practices
Timeout selection:
- Set based on user experience requirements
- Consider model complexity and prompt length
- Factor in network latency (add 2-5s buffer)
- Test with realistic prompts and data
Environment-specific timeouts:
const timeouts = {
development: 10000, // Fast feedback during dev
staging: 20000, // Realistic testing
production: 30000, // Conservative for reliability
};
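At startup, the map above can be resolved once. This sketch uses the common Node.js `NODE_ENV` convention; the map is repeated so the snippet runs standalone:

```javascript
const timeouts = {
  development: 10000, // fast feedback during dev
  staging: 20000, // realistic testing
  production: 30000, // conservative for reliability
};

// Default to the conservative production value when NODE_ENV is unset
// or names an environment not in the map.
const callTimeout = timeouts[process.env.NODE_ENV] ?? timeouts.production;
```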
Progressive timeouts:
// Start with short timeout, increase for retries
const attempts = [
{ timeout: 10000, model: "fast-model" },
{ timeout: 20000, model: "standard-model" },
{ timeout: 30000, model: "comprehensive-model" },
];
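One way to walk through such an escalation list (the model names above are illustrative, and `requestFn` stands in for your actual proxy call):

```javascript
// Try each attempt in order; on failure, escalate to the next entry.
async function withProgressiveTimeouts(attempts, requestFn) {
  let lastError;
  for (const attempt of attempts) {
    try {
      return await requestFn(attempt); // attempt = { timeout, model }
    } catch (error) {
      lastError = error; // remember and move to a slower, more capable tier
    }
  }
  throw lastError; // every tier failed
}
```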
Fallback Integration
Timeouts work seamlessly with fallbacks:
{
orq: {
timeout: {call_timeout: 15000}, // Applied to each attempt
fallbacks: [
{model: "openai/gpt-4o"}, // Gets 15s timeout
{model: "openai/gpt-3.5-turbo"} // Also gets 15s timeout
]
}
}
Total possible time: timeout × (1 + fallback_count)
- Primary + 2 fallbacks with 15s timeout = up to 45s total
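The worst-case arithmetic above can be captured directly:

```javascript
// Worst case: the primary attempt and every fallback each exhaust the timeout.
function maxTotalTime(callTimeout, fallbackCount) {
  return callTimeout * (1 + fallbackCount);
}

maxTotalTime(15000, 2); // → 45000 ms, matching the example above
```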
Troubleshooting
Frequent timeouts
- Increase timeout value
- Use faster models (gpt-3.5-turbo vs gpt-4)
- Reduce prompt complexity/length
- Check provider status for slowdowns
User experience issues
- Set timeout based on user expectations
- Show loading states for longer operations
- Implement progressive enhancement
- Consider async processing for long tasks
Performance optimization
// Monitor timeout patterns
const timeoutMetrics = {
averageResponseTime: 0,
timeoutRate: 0,
responseTimesByModel: {},
optimalTimeout: 0, // 95th percentile + buffer
};
Advanced Patterns
Dynamic timeout adjustment:
const getDynamicTimeout = (promptLength, modelComplexity) => {
  const baseTimeout = 10000;
  // Scale between 1x and 3x with prompt length; never drop below the base.
  const promptFactor = Math.max(1, Math.min(promptLength / 1000, 3));
  const modelFactor = modelComplexity === "simple" ? 1 : 2;
  return baseTimeout * promptFactor * modelFactor;
};
Timeout with streaming:
{
stream: true,
orq: {
timeout: {
call_timeout: 30000 // Longer timeout for streaming
}
}
}
Circuit breaker pattern:
class CircuitBreaker {
  constructor(timeout, failureThreshold = 5) {
    this.timeout = timeout;
    this.failureCount = 0;
    this.failureThreshold = failureThreshold;
    this.state = "CLOSED"; // CLOSED, OPEN, HALF_OPEN
  }
  async call(requestFn) {
    if (this.state === "OPEN") {
      throw new Error("Circuit breaker is OPEN");
    }
    try {
      const result = await requestFn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }
  onSuccess() {
    // Any success closes the circuit and clears the failure count.
    this.failureCount = 0;
    this.state = "CLOSED";
  }
  onFailure() {
    this.failureCount++;
    if (this.failureCount >= this.failureThreshold) {
      this.state = "OPEN"; // stop sending requests until reset
    }
  }
}
Limitations
- Fixed timeout: Same timeout applies to all requests
- No granular control: Cannot set different timeouts for different operations
- Fallback multiplication: Each fallback gets the same timeout duration
- Provider variations: Different providers have different baseline response times
- Streaming considerations: Streaming responses may need longer timeouts
Monitoring
Key metrics to track:
- Timeout rate: % of requests that timeout
- Average response time: Baseline performance
- 95th percentile: For setting optimal timeouts
- Timeout impact: User experience degradation
- Model performance: Response times by model
// Example monitoring (sketch; `calculatePercentile` is assumed to be defined elsewhere)
const metrics = {
  totalRequests: 0,
  timeouts: 0,
  responseTimes: [],
  recommendedTimeout: 0,
};

const monitorTimeouts = (responseTime, wasTimeout) => {
  metrics.totalRequests++;
  if (wasTimeout) {
    metrics.timeouts++;
  } else {
    metrics.responseTimes.push(responseTime);
  }
  // Calculate optimal timeout (95th percentile + 5s buffer)
  const p95 = calculatePercentile(metrics.responseTimes, 95);
  metrics.recommendedTimeout = p95 + 5000;
};
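`calculatePercentile` is not defined in the snippet above; a minimal nearest-rank version could look like:

```javascript
// Nearest-rank percentile: sort ascending, take the value at ceil(p% * n).
function calculatePercentile(values, percentile) {
  if (values.length === 0) return 0;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((percentile / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}
```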