Auto Router | Intelligent routing in AI Gateway

Not every request needs the most expensive model. The Auto Router automatically routes each request to the optimal model based on the chosen profile, to reduce costs without sacrificing quality. Configure a Strong Model and an Economical Model, then choose how aggressively to route between them. The Auto Router evaluates each incoming request and sends it to whichever model best matches the task complexity and the optimization goal. Two directions are supported:

Optimize for cost: set a high-quality model as the baseline. The Auto Router routes simpler requests to the cheaper model and escalates only when complexity warrants it. This saves on requests that don’t need the most powerful model.
Optimize for quality: start with a cost-efficient model and let the Auto Router escalate to the more capable model only when the task demands it. Get the best output for every request without overspending.

Use Cases

Scenario	Setup	Outcome
Customer support chatbot	Strong: Claude Opus / Economical: Gemini Flash	Simple FAQs and acknowledgements go to the fast model; nuanced complaints or policy questions escalate automatically
Document summarization pipeline	Strong: GPT-4o / Economical: GPT-4o Mini	Short documents with clear structure route to the mini model; long, dense, or ambiguous documents go to the full model
Code assistant	Strong: Claude Sonnet / Economical: Gemini Flash	Autocomplete and boilerplate generation stay cheap; debugging, architecture questions, and multi-file reasoning escalate
Content generation at scale	Strong: GPT-5.1 / Economical: GPT-4o Mini	High-volume social copy and templated content uses the cheaper model; long-form articles or brand-sensitive copy uses the stronger one
Internal Q&A over documents	Strong: Claude Opus / Economical: Claude Haiku	Retrieval-augmented lookups with clear answers route to Haiku; open-ended synthesis or conflicting sources escalate to Opus

How It Works

The Auto Router sits between the application and two models: a Strong Model for complex requests and an Economical Model for simpler ones. When a request comes in, it analyzes the task complexity and routes it to the appropriate model based on the configured profile.

Set Up the Auto Router

Navigate to the Models page in AI Gateway.
Click Add Model.
Select Auto Router from the dropdown.
Fill in the configuration:
- Model ID: a unique identifier for this router (lowercase letters, numbers, and hyphens only).
- Strong Model: the more capable model, used for complex requests.
- Economical Model: the cheaper model, used for simpler requests.
- Profile: choose how aggressively to route between the two models.
Click Add model.

Profiles

Profile	Behavior
Quality	Prioritizes the Strong Model for more requests
Balanced	Balances cost and quality across simple and complex requests
Cost	Prefers the Economical Model more aggressively to save money

Recommended model pairs

These pairs combine high routing accuracy with significant cost ratios (over 10x), making them effective starting points.

Strong Model	Economical Model
Google Gemini 2.5 Pro	Google Gemini 2.5 Flash
OpenAI GPT-5.1	OpenAI GPT-4o Mini
Anthropic Claude Opus 4	Google Gemini 2.5 Flash
OpenAI GPT-4o	OpenAI GPT-4o Mini

Models from the same family or tier work well together (e.g. Claude Sonnet and Gemini Flash). Very large capability gaps reduce the effectiveness of routing.

Models from different providers can be combined in a single Auto Router configuration.

Use the Auto Router

Once created, the Auto Router appears in AI Gateway and can be referenced anywhere a model is accepted via the API or SDKs.

Reference in code

When using an Auto Router through the SDKs, API, or Supported Libraries, reference it by the string <workspacename>@orq/<model-id>.

Example: acme@orq/my-auto-router

​Use Cases

​How It Works

​Set Up the Auto Router

​Profiles

​Recommended model pairs

​Use the Auto Router

​Reference in code

Use Cases

How It Works

Set Up the Auto Router

Profiles

Recommended model pairs

Use the Auto Router

Reference in code