Quick Start
The router supports two reasoning controls onPOST /chat/completions:
reasoning_effortfor OpenAI reasoning modelsthinkingfor Google Gemini and Anthropic extended thinking
The Go gateway mirrors the same contract internally through
models.ModelParameters.ReasoningEffort and
models.ModelParameters.Thinking.Request Fields
| Field | Type | Values | Notes |
|---|---|---|---|
reasoning_effort | string | none, minimal, low, medium, high, xhigh | OpenAI-style reasoning control |
thinking.type | string | enabled, disabled | Used by Google Gemini and Anthropic thinking paths |
thinking.budget_tokens | number | integer | Budget-based thinking |
thinking.thinking_level | string | low, high | Level-based thinking for Gemini 3 preview models |
Provider Behavior
OpenAI reasoning models
Usereasoning_effort on POST /chat/completions.
Current registry examples:
openai/o1openai/o1-proopenai/o3-miniopenai/o3openai/o3-pro
o1 and o3 entries in the model registry only advertise low, medium, and high. Model support is ultimately model-specific.
Google Gemini
Use thethinking object.
Level-based examples:
google/gemini-3-flash-previewgoogle/gemini-3-pro-preview
google/gemini-2.5-flashgoogle/gemini-2.5-flash-litegoogle/gemini-2.5-pro
thinking: { "type": "disabled" }is valid- On
thinking_enforcedmodels such asgoogle/gemini-2.5-pro, disabling thinking is coerced to a minimum budget of128 - On non-enforced Gemini models, disabling thinking becomes a budget of
0
Anthropic Claude
OnPOST /chat/completions, Anthropic uses thinking: { type, budget_tokens }.
Current registry examples:
anthropic/claude-sonnet-4-20250514anthropic/claude-sonnet-4-5-20250929anthropic/claude-opus-4-5-20251101
- Anthropic chat completions only forward thinking when
typeisenabled budget_tokensmust be greater than0to be forwardedthinking_levelis not used for Anthropic chat completions
Responses API
If you callPOST /responses, use the OpenAI-style reasoning object instead of reasoning_effort.
Usage and Output
Reasoning token usage is returned underusage.completion_tokens_details.reasoning_tokens.
Code Examples
Choosing a Setting
Usereasoning_effort when the model is in the OpenAI o1 or o3 family. Use thinking_level for Gemini 3 preview models. Use budget_tokens for Anthropic and budget-based Gemini models.
If you need the current model catalog, use Supported Models.