Skip to main content

What is AI Gateway?

AI Gateway is a standalone product for teams and developers who need a production-grade LLM routing layer. Point any OpenAI-compatible client at the AI Gateway endpoint and immediately gain:

Model routing

Route requests to 300+ models across 20+ providers through a single unified endpoint. Switch models without touching application code.

Reliability

Automatic retries, fallbacks, load balancing, and timeouts. Traffic keeps flowing when a provider goes down.

Cost and activity tracking

Every request is logged in the Overview page with model, latency, token counts, and cost. No extra instrumentation needed.

Who is it for?

AI Gateway is designed for:
  • Developers and technical teams who want a drop-in routing layer on top of their existing LLM calls
  • Companies focused on routing and cost control who do not need prompt management, agent orchestration, or experiment tooling
  • Teams already on a framework (LangChain, Vercel AI, OpenAI Agents SDK, etc.) who want provider flexibility with a single base URL change

Get started

Quick Start

Connect a provider, get an API key, and route the first request in under five minutes.

Supported Models

Browse the full model catalog across OpenAI, Anthropic, Google, AWS, Azure, and more.

Key capabilities

Auto Router

Let AI Gateway pick the best model for each request based on quality, cost, and latency targets.

Multimodal

Send images, PDFs, audio, and files through the same unified endpoint. No API changes needed.

Policies

Set workspace-level rules that apply to every request: model allow/deny lists, cost caps, and routing overrides.

Guardrails

Attach LLM-as-a-judge or Python guardrails to guardrail rules for input and output filtering.

Fallbacks and retries

Define ordered fallback chains so requests automatically retry on a backup model.

Identity tracking

Attribute requests to a specific user or tenant for per-identity cost and usage visibility.

Load balancing

Distribute traffic across multiple provider configurations or API keys to stay within rate limits.

How it works

Send any OpenAI-compatible request to the AI Gateway endpoint using a provider-prefixed model ID (e.g. openai/gpt-4o). The gateway resolves the provider, applies policies and guardrails, routes to the model, and logs the result in Activity. See the Quick Start for a step-by-step walkthrough.