reasoning_effort, while Google Gemini and Anthropic use a thinking object. The AI Gateway accepts all three controls and normalizes values to what each model actually supports before forwarding the request.
| Provider | Control | Values |
|---|---|---|
| OpenAI o-series | reasoning_effort | none, minimal, low, medium, high, xhigh |
| Google Gemini 3 preview | thinking.thinking_level | low, high |
| Google Gemini 2.5 | thinking.budget_tokens | integer |
| Anthropic Claude | thinking.budget_tokens | integer |
- Problems requiring multi-step logical deduction (math proofs, code debugging, planning).
- Complex analysis where a standard model produces shallow or incorrect results.
- Research tasks where depth of reasoning matters more than response speed.
- Benchmarking reasoning quality across providers on identical prompts.
Quick Start
The AI Gateway supports three reasoning controls:reasoningobject onPOST /responsesfor OpenAI reasoning models.reasoning_effortonPOST /chat/completionsfor OpenAI reasoning models.thinkingonPOST /chat/completionsfor Google Gemini and Anthropic extended thinking.
Request Fields
| Field | Type | Values | Notes |
|---|---|---|---|
reasoning_effort | string | none, minimal, low, medium, high, xhigh | OpenAI-style reasoning control |
thinking.type | string | enabled, disabled | Used by Google Gemini and Anthropic thinking paths |
thinking.budget_tokens | number | integer | Budget-based thinking |
thinking.thinking_level | string | low, high | Level-based thinking for Gemini 3 preview models |
Provider Behavior
OpenAI reasoning models
Usereasoning_effort on POST /chat/completions.
Current registry examples:
openai/o1.openai/o1-pro.openai/o3-mini.openai/o3.openai/o3-pro. The AI Gateway schema accepts all six enum values, but model support is ultimately model-specific.
The router normalizes
reasoning_effort to the nearest value a model supports before forwarding the request. For example, openai/gpt-5.4 does not support xhigh: it maps to high. Models that do support xhigh receive the value as-is.When
reasoning_effort is set, the AI Gateway automatically drops temperature
and top_p before forwarding the request. These parameters are incompatible
with OpenAI reasoning models and will cause an error if sent directly.OpenAI
Set up your OpenAI API key and explore all supported models including the o1 and o3 families.
Google Gemini
Use thethinking object.
Level-based (thinking_level) examples:
-
google/gemini-3-flash-preview. -
google/gemini-3-pro-preview. Budget-based (budget_tokens) examples: -
google/gemini-2.5-flash. -
google/gemini-2.5-flash-lite. -
google/gemini-2.5-pro. Router behavior: -
thinking: { "type": "disabled" }is valid -
On
thinking_enforcedmodels such asgoogle/gemini-2.5-pro, disabling thinking is coerced to a minimum budget of128 -
On non-enforced Gemini models, disabling thinking becomes a budget of
0
Google AI
Set up your Google AI API key and explore Gemini 2.5 and Gemini 3 thinking models.
Anthropic Claude
OnPOST /chat/completions, Anthropic uses thinking: { type, budget_tokens }.
Current registry examples:
-
anthropic/claude-sonnet-4-20250514. -
anthropic/claude-sonnet-4-5-20250929. -
anthropic/claude-opus-4-5-20251101. Router behavior: -
Anthropic chat completions only forward thinking when
typeisenabled. -
budget_tokensmust be greater than0to be forwarded. -
thinking_levelis not used for Anthropic chat completions.
Anthropic
Set up your Anthropic API key and explore Claude extended thinking capabilities.
Responses API
POST /responses supports reasoning for OpenAI models only. Use the OpenAI-style reasoning object with effort instead of reasoning_effort.
thinking (Anthropic and Google Gemini) is not supported on the /responses endpoint. Use POST /chat/completions for Anthropic and Google reasoning models.
Usage and Output
Reasoning token usage is returned underusage.completion_tokens_details.reasoning_tokens.
Code Examples
Choosing a Setting
Usereasoning_effort when the model is in the OpenAI o1 or o3 family. Use thinking_level for Gemini 3 preview models. Use budget_tokens for Anthropic and budget-based Gemini models.
If you need the current model catalog, use Supported Models.