This page describes features extending the AI Gateway, which provides a unified API for accessing multiple AI providers. To learn more, see AI Gateway.
## Quick Start
Enable step-by-step reasoning for complex problems and analysis.
## Configuration by Provider
### OpenAI Models (o1 series)
| Parameter | Type | Values | Description | 
|---|---|---|---|
| `reasoning_effort` | string | `low`, `medium`, `high` | Depth of reasoning process |

Supported models:
- openai/o1-preview
- openai/o1-mini
- openai/o3-mini
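For illustration, a minimal payload builder for the parameter above. The `reasoning_effort` field and model IDs come from this page; the helper itself is hypothetical:

```python
# Hypothetical helper: builds an OpenAI-compatible chat request that
# routes an o-series model through the gateway with reasoning enabled.
def build_openai_reasoning_request(prompt: str, effort: str = "medium") -> dict:
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"invalid reasoning_effort: {effort!r}")
    return {
        "model": "openai/o3-mini",  # any model from the list above
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }
```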
### Other Models
| Parameter | Type | Description |
|---|---|---|
| `thinking.type` | `"enabled"` | Enable reasoning capability |
| `thinking.budget_tokens` | number | Max tokens for reasoning (1000-10000) |

Supported models:
- google/gemini-2.5-pro
- anthropic/claude-3-5-sonnet
- Other compatible models
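A sketch of the equivalent builder for the `thinking`-style parameters, enforcing the documented 1000-10000 budget range (the helper name and model choice are illustrative):

```python
# Hypothetical helper for models that use the thinking.* parameters.
def build_thinking_request(prompt: str, budget_tokens: int = 4000) -> dict:
    # Enforce the documented 1000-10000 range for budget_tokens.
    if not 1000 <= budget_tokens <= 10000:
        raise ValueError("budget_tokens must be between 1000 and 10000")
    return {
        "model": "anthropic/claude-3-5-sonnet",  # any model from the list above
        "messages": [{"role": "user", "content": prompt}],
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }
```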
## Reasoning Effort Levels
| Level | Use Case | Processing Time | Accuracy | 
|---|---|---|---|
| low | Simple calculations, basic logic | ~10s | Good | 
| medium | Multi-step problems, analysis | ~30s | Better | 
| high | Complex reasoning, research tasks | ~60s+ | Best | 
## Use Cases
| Problem Type | Recommended Settings | Example | 
|---|---|---|
| Math problems | `medium` effort | "Calculate compound interest over 10 years" |
| Logic puzzles | `high` effort | "Solve this Sudoku puzzle" |
| Code debugging | `medium` effort | "Find the bug in this Python function" |
| Strategic planning | `high` effort | "Create a business plan for a SaaS startup" |
| Data analysis | `medium`-`high` effort | "Analyze trends in this sales data" |
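The table above can be encoded as a small lookup. This is a sketch with hypothetical category keys; the `medium-high` recommendation is resolved to `high` here:

```python
# Recommended effort per problem type, mirroring the use-case table.
RECOMMENDED_EFFORT = {
    "math": "medium",
    "logic": "high",
    "debugging": "medium",
    "planning": "high",
    "analysis": "high",  # table says medium-high; taking the upper bound
}

def effort_for(problem_type: str) -> str:
    # Default to medium for problem types the table does not cover.
    return RECOMMENDED_EFFORT.get(problem_type, "medium")
```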
## Code examples
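A minimal end-to-end sketch using only the standard library. The base URL is a placeholder, and the OpenAI-compatible `/chat/completions` path plus bearer-token auth are assumptions about your gateway deployment; substitute your actual values:

```python
import json
import urllib.request

BASE_URL = "https://example-gateway.invalid/v1"  # placeholder, not a real endpoint

def reasoning_payload(model: str, prompt: str, effort: str = "medium") -> dict:
    # Request body using the reasoning_effort parameter described above.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }

def ask_with_reasoning(api_key: str, model: str, prompt: str,
                       effort: str = "medium") -> dict:
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(reasoning_payload(model, prompt, effort)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    # Generous timeout: high-effort reasoning can take minutes.
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```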
## Response Structure
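A minimal parsing sketch, assuming OpenAI-style response fields (`choices[0].message.content` for the answer, `usage.completion_tokens_details.reasoning_tokens` for reasoning usage) pass through the gateway unchanged:

```python
def extract_answer_and_reasoning_tokens(response: dict) -> tuple:
    # The visible answer; the chain of thought itself is not returned.
    answer = response["choices"][0]["message"]["content"]
    details = response.get("usage", {}).get("completion_tokens_details", {})
    return answer, details.get("reasoning_tokens", 0)

# Illustrative response shape (values are made up):
sample = {
    "choices": [{"message": {"role": "assistant", "content": "408"}}],
    "usage": {"completion_tokens_details": {"reasoning_tokens": 320}},
}
```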
OpenAI reasoning models return the final answer along with reasoning-token usage in the response metadata.
## Best Practices
**Effort/budget selection:** match the effort level or token budget to problem complexity (see the use-case table above).
## Performance Considerations
Typical response times:
- `low` effort: 5-15 seconds
- `medium` effort: 15-45 seconds
- `high` effort: 45-120 seconds
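Those ranges suggest effort-dependent client timeouts. A sketch, with an assumed 1.5x safety factor:

```python
# Upper ends of the typical response-time ranges above, in seconds.
TYPICAL_MAX_SECONDS = {"low": 15, "medium": 45, "high": 120}

def timeout_for(effort: str, safety_factor: float = 1.5) -> float:
    # Leave headroom beyond the typical worst case.
    return TYPICAL_MAX_SECONDS[effort] * safety_factor
```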
## Troubleshooting
**Slow responses**
- Use lower reasoning effort for time-sensitive applications
- Consider async processing for complex reasoning tasks
- Implement timeouts appropriate for reasoning models
**High token consumption**
- Monitor reasoning token consumption
- Adjust budget_tokens for non-OpenAI models
- Use lower effort levels when appropriate
**Poor reasoning quality**
- Increase reasoning effort/budget for complex problems
- Improve prompt specificity and clarity
- Try different reasoning-capable models
## Advanced Patterns
### Conditional reasoning
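One way to implement conditional reasoning is to gate the effort level on a cheap prompt heuristic so simple requests stay fast. The marker list and helper are illustrative, not part of the gateway API:

```python
# Keywords that suggest a prompt needs deeper reasoning (an assumption;
# tune this list for your workload).
COMPLEX_MARKERS = ("prove", "plan", "analyze", "debug", "step by step")

def build_conditional_request(prompt: str) -> dict:
    effort = ("high" if any(m in prompt.lower() for m in COMPLEX_MARKERS)
              else "low")
    return {
        "model": "openai/o3-mini",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }
```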
### Progressive reasoning
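Progressive reasoning can be sketched as an escalation loop: start cheap and re-ask at higher effort until a caller-supplied quality check passes. `call_model` is a stand-in for the actual gateway call:

```python
from typing import Callable

def progressive_answer(prompt: str,
                       call_model: Callable[[str, str], str],
                       is_good: Callable[[str], bool]) -> str:
    answer = ""
    for effort in ("low", "medium", "high"):
        answer = call_model(prompt, effort)
        if is_good(answer):
            return answer  # stop escalating once quality is acceptable
    return answer  # best available after the highest effort
```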
### Reasoning with fallbacks
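A fallback sketch: try a reasoning-capable model first and fall back to another model if the call fails or times out. The model IDs are taken from the lists above; the broad `except` is for illustration only:

```python
def answer_with_fallback(prompt: str, call_model,
                         primary: str = "openai/o3-mini",
                         fallback: str = "anthropic/claude-3-5-sonnet") -> str:
    try:
        return call_model(primary, prompt)
    except Exception:
        # Timeout, rate limit, or model unavailable: degrade gracefully.
        return call_model(fallback, prompt)
```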
## Limitations
- **Response time:** Reasoning adds significant latency to generation
- **Cost:** Reasoning tokens are charged at higher rates
- **Model availability:** Limited to specific reasoning-capable models
- **Token limits:** Reasoning may hit context limits faster
- **Determinism:** Reasoning output may vary between requests