This page describes features extending the AI Gateway, which provides a unified API for accessing multiple AI providers. To learn more, see AI Gateway.
Quick Start
Set a maximum request duration to prevent hanging requests.
Configuration
| Parameter | Type | Required | Description | 
|---|---|---|---|
| call_timeout | number | Yes | Maximum execution time in milliseconds | 
The timeout covers:
- Request processing time
- Model generation time
- Network transfer time
- All fallback attempts (each attempt gets the same timeout)
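As a rough sketch, setting the parameter might look like the following; the model, messages, and surrounding payload shape are illustrative assumptions, while `call_timeout` is the parameter described above.

```ts
// A minimal request-body sketch. Everything except call_timeout is an
// illustrative assumption about the payload shape.
const requestBody = {
  model: "gpt-4",
  messages: [{ role: "user", content: "Hello!" }],
  // Limit each attempt (primary and any fallbacks) to 15 seconds.
  call_timeout: 15_000,
};
```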
Recommended Values
| Use Case | Timeout (ms) | Reason | 
|---|---|---|
| Chat applications | 15000 (15s) | User expectation for responses |
| Real-time features | 5000 (5s) | Immediate feedback required |
| Batch processing | 60000 (60s) | Complex analysis tasks |
| Streaming responses | 30000 (30s) | Longer generation time |
| Development/testing | 10000 (10s) | Fast iteration cycles |
Code Examples
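The sketch below assumes a JSON-over-HTTP gateway endpoint; the URL, headers, and response handling are placeholders, and the timeout presets come from the Recommended Values table above.

```ts
// Timeout presets from the Recommended Values table above (milliseconds).
const TIMEOUTS = {
  chat: 15_000,
  realtime: 5_000,
  batch: 60_000,
  streaming: 30_000,
  development: 10_000,
} as const;

type UseCase = keyof typeof TIMEOUTS;

// Hypothetical endpoint and payload shape; call_timeout is the documented parameter.
async function callGateway(useCase: UseCase, prompt: string): Promise<unknown> {
  const response = await fetch("https://gateway.example.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.GATEWAY_API_KEY ?? ""}`,
    },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [{ role: "user", content: prompt }],
      call_timeout: TIMEOUTS[useCase],
    }),
  });
  if (!response.ok) {
    throw new Error(`Gateway request failed with status ${response.status}`);
  }
  return response.json();
}

// Example: a chat request gets the 15s preset; a batch job would get 60s.
callGateway("chat", "Explain call_timeout in one sentence.").then(console.log);
```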
Error Handling
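How an exceeded `call_timeout` surfaces to the caller depends on the deployment; the sketch below assumes the gateway responds with a timeout-style HTTP status (408 and 504 are used here as placeholders) and shows one way to degrade gracefully.

```ts
// Handle a request that exceeds call_timeout. The 408/504 status check and
// the response shape are assumptions; adapt them to what your gateway returns.
async function askWithTimeoutHandling(prompt: string): Promise<string> {
  try {
    const response = await fetch("https://gateway.example.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.GATEWAY_API_KEY ?? ""}`,
      },
      body: JSON.stringify({
        model: "gpt-4",
        messages: [{ role: "user", content: prompt }],
        call_timeout: 15_000,
      }),
    });

    if (response.status === 408 || response.status === 504) {
      // The call timed out after all attempts: show a friendly message.
      return "The model took too long to respond. Please try again.";
    }
    if (!response.ok) {
      throw new Error(`Gateway error: ${response.status}`);
    }

    const data = await response.json();
    return data.choices?.[0]?.message?.content ?? "";
  } catch (err) {
    // Network failures and other unexpected errors.
    console.error("Request failed:", err);
    return "Something went wrong. Please try again.";
  }
}
```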
Best Practices
Timeout selection:
- Set based on user experience requirements
- Consider model complexity and prompt length
- Factor in network latency (add a 2-5s buffer)
- Test with realistic prompts and data (see the sketch below)
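One way to apply this guidance is to derive the timeout from measured latency plus the suggested network buffer. The duration samples below are placeholders you would replace with your own measurements.

```ts
// Derive call_timeout from observed latency: the 95th-percentile response
// time plus a 2-5s network buffer, as recommended above.
function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const index = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[index];
}

function recommendedTimeoutMs(durationsMs: number[], bufferMs = 3_000): number {
  return percentile(durationsMs, 95) + bufferMs;
}

// Placeholder measurements: recent chat completions took 4-12 seconds.
const recentDurationsMs = [4200, 5100, 6300, 7800, 9000, 11500, 12000];
const callTimeout = recommendedTimeoutMs(recentDurationsMs); // 12000 + 3000 = 15000
```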
Fallback Integration
Timeouts work seamlessly with fallbacks: each attempt (primary and every fallback) gets the same timeout, so the worst-case total duration is timeout × (1 + fallback_count).
- Primary + 2 fallbacks with 15s timeout = up to 45s total
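Budget for this worst case when fallbacks are enabled; the helper below simply restates the multiplication above in code.

```ts
// Worst-case wall-clock time: the primary attempt plus each fallback
// attempt is allowed the full call_timeout.
function worstCaseDurationMs(callTimeoutMs: number, fallbackCount: number): number {
  return callTimeoutMs * (1 + fallbackCount);
}

// Primary + 2 fallbacks with a 15s timeout = up to 45s total.
console.log(worstCaseDurationMs(15_000, 2)); // 45000
```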
Troubleshooting
**Frequent timeouts**
- Increase timeout value
- Use faster models (gpt-3.5-turbo vs gpt-4)
- Reduce prompt complexity/length
- Check provider status for slowdowns
**Slow responses**
- Set timeout based on user expectations
- Show loading states for longer operations
- Implement progressive enhancement
- Consider async processing for long tasks
Advanced Patterns
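Dynamic timeout adjustment: choose `call_timeout` per request instead of hard-coding one value, for example based on the model and prompt length. The base values and scaling factors below are illustrative assumptions, not gateway defaults.

```ts
// Pick call_timeout per request. All constants here are illustrative.
function dynamicTimeoutMs(opts: { model: string; promptChars: number; streaming?: boolean }): number {
  // Slower models get a larger base timeout.
  let timeout = opts.model.startsWith("gpt-4") ? 20_000 : 10_000;

  // Longer prompts tend to produce longer generations: +1s per 1,000 characters.
  timeout += Math.ceil(opts.promptChars / 1_000) * 1_000;

  // Streaming responses may need longer timeouts (see Limitations below).
  if (opts.streaming) timeout = Math.round(timeout * 1.5);

  // Clamp to a sensible range.
  return Math.min(Math.max(timeout, 5_000), 60_000);
}

const callTimeout = dynamicTimeoutMs({ model: "gpt-4", promptChars: 3_500, streaming: true });
// (20000 + 4000) * 1.5 = 36000 ms
```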
Limitations
- Fixed timeout: Same timeout applies to all requests
- No granular control: Cannot set different timeouts for different operations
- Fallback multiplication: Each fallback gets the same timeout duration
- Provider variations: Different providers have different baseline response times
- Streaming considerations: Streaming responses may need longer timeouts
Monitoring
Key metrics to track:
- Timeout rate: % of requests that time out
- Average response time: Baseline performance
- 95th percentile response time: Basis for setting optimal timeouts
- Timeout impact: User experience degradation
- Model performance: Response times by model
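A sketch of deriving these metrics from request logs; the log record shape is an assumption, and the output feeds back into timeout selection above.

```ts
// Compute the monitoring metrics above from request logs. The RequestLog
// shape is an assumed example; adapt it to whatever telemetry you collect.
interface RequestLog {
  model: string;
  durationMs: number;
  timedOut: boolean;
}

function summarize(logs: RequestLog[]) {
  if (logs.length === 0) throw new Error("No request logs to summarize");
  const durations = logs.map((l) => l.durationMs).sort((a, b) => a - b);
  const p95Index = Math.min(durations.length - 1, Math.ceil(durations.length * 0.95) - 1);
  return {
    // % of requests that time out.
    timeoutRate: logs.filter((l) => l.timedOut).length / logs.length,
    // Baseline performance.
    averageResponseMs: durations.reduce((sum, d) => sum + d, 0) / durations.length,
    // Candidate basis for call_timeout (plus a network buffer).
    p95ResponseMs: durations[p95Index],
  };
}
```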