Documentation Index
Fetch the complete documentation index at: https://docs.orq.ai/llms.txt
Use this file to discover all available pages before exploring further.
Quick Start
The router supports two reasoning controls onPOST /chat/completions:
reasoning_effortfor OpenAI reasoning modelsthinkingfor Google Gemini and Anthropic extended thinking
The Go gateway mirrors the same contract internally through
models.ModelParameters.ReasoningEffort and
models.ModelParameters.Thinking.Request Fields
| Field | Type | Values | Notes |
|---|---|---|---|
reasoning_effort | string | none, minimal, low, medium, high, xhigh | OpenAI-style reasoning control |
thinking.type | string | enabled, disabled | Used by Google Gemini and Anthropic thinking paths |
thinking.budget_tokens | number | integer | Budget-based thinking |
thinking.thinking_level | string | low, high | Level-based thinking for Gemini 3 preview models |
Provider Behavior
OpenAI reasoning models
Usereasoning_effort on POST /chat/completions.
Current registry examples:
openai/o1openai/o1-proopenai/o3-miniopenai/o3openai/o3-pro
o1 and o3 entries in the model registry only advertise low, medium, and high. Model support is ultimately model-specific.
When
reasoning_effort is set, the router automatically drops temperature
and top_p before forwarding the request. These parameters are incompatible
with OpenAI reasoning models and will cause an error if sent directly.OpenAI
Set up your OpenAI API key and explore all supported models including the o1 and o3 families.
Google Gemini
Use thethinking object.
Level-based examples:
google/gemini-3-flash-previewgoogle/gemini-3-pro-preview
google/gemini-2.5-flashgoogle/gemini-2.5-flash-litegoogle/gemini-2.5-pro
thinking: { "type": "disabled" }is valid- On
thinking_enforcedmodels such asgoogle/gemini-2.5-pro, disabling thinking is coerced to a minimum budget of128 - On non-enforced Gemini models, disabling thinking becomes a budget of
0
Google AI
Set up your Google AI API key and explore Gemini 2.5 and Gemini 3 thinking models.
Anthropic Claude
OnPOST /chat/completions, Anthropic uses thinking: { type, budget_tokens }.
Current registry examples:
anthropic/claude-sonnet-4-20250514anthropic/claude-sonnet-4-5-20250929anthropic/claude-opus-4-5-20251101
- Anthropic chat completions only forward thinking when
typeisenabled budget_tokensmust be greater than0to be forwardedthinking_levelis not used for Anthropic chat completions
Anthropic
Set up your Anthropic API key and explore Claude extended thinking capabilities.
Responses API
If you callPOST /responses, use the OpenAI-style reasoning object instead of reasoning_effort.
Usage and Output
Reasoning token usage is returned underusage.completion_tokens_details.reasoning_tokens.
Code Examples
Choosing a Setting
Usereasoning_effort when the model is in the OpenAI o1 or o3 family. Use thinking_level for Gemini 3 preview models. Use budget_tokens for Anthropic and budget-based Gemini models.
If you need the current model catalog, use Supported Models.