TL;DR:
- Connect Orq.ai to LLM providers.
- Use cURL streaming for real-time model responses.
- Add a knowledge base to enhance contextual understanding.
- Build a simple customer support agent powered by connected models and your data.
AI Gateway is a single unified API endpoint that lets you seamlessly route and manage requests across multiple AI model providers (e.g., OpenAI, Anthropic, Google, AWS). This functionality comes in handy when you want to avoid depending on a single provider and automatically switch between providers in case of an outage. The AI Gateway frees you from vendor lock-in and ensures that you can scale reliably when usage surges.
Getting started with AI Gateway
To get started, decide which provider you want to connect to. Here is a sample OpenAI integration:
- Navigate to Integrations
- Select OpenAI
- Click on View integration
Next, retrieve your API key:
- Go to Workspace settings
- Open API Keys
- Copy your key

Now send your first request, replacing $ORQ_API_KEY with your API key:
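Here is a minimal sketch of that first request, assuming an OpenAI-compatible chat completions route on the gateway. The endpoint path and the provider-prefixed model identifier below are assumptions; verify both against the Orq.ai API reference for your workspace.

```bash
# Minimal first request. The endpoint path and model name are
# assumptions; check the Orq.ai API reference for the exact values.
curl https://api.orq.ai/v2/proxy/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```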
If everything is set up correctly, you will get a reply back from your API call, such as: "Hello! This is a response. How can I assist you today?"
Troubleshooting common errors
Streaming
Streaming sends the model's response back incrementally as it is generated, instead of making you wait for the full completion.
"stream": true, the API uses a Server-Sent Events (SSE) connection, an open HTTP connection that continuously sends small packets of data.
Retries & fallbacks
Fallbacks are backup models the gateway switches to automatically when the primary model keeps failing or is rate-limited; combined with retries, they keep requests flowing even through a provider outage.
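As an illustration, a request that retries up to 3 times and then falls back to other models might look like the sketch below. The retry and fallback field names under "orq" are assumptions about the payload shape; only the behavior (3 retries on 429/5xx, then fallback models) comes from this guide.

```bash
# Retry/fallback sketch. The "orq" option names are assumptions;
# the retry counts and status codes are the ones described in this guide.
curl https://api.orq.ai/v2/proxy/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Why is my webhook failing?"}],
    "orq": {
      "retries": {"count": 3, "on_status": [429, 500, 502, 503]},
      "fallbacks": ["anthropic/claude-3-5-sonnet", "openai/gpt-4o-mini"]
    }
  }'
```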
Caching
Orq.ai supports response caching to reduce latency and API usage (a request sketch follows the list):
- type: "exact_match" → caches identical requests and reuses responses.
- ttl: 1800 → cache entries expire after 30 minutes (1800 seconds).
The benefits:
- Faster responses for repeated questions.
- Reduced API calls → lower cost.
- Example: A repeated FAQ query will return instantly from the cache instead of hitting the model.
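The cache settings above would be passed along with the request. The type and ttl values come straight from this guide; the placement under "orq" is an assumption.

```bash
# Caching sketch: identical requests within 30 minutes (ttl: 1800)
# are served from the cache instead of hitting the model.
curl https://api.orq.ai/v2/proxy/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "What are your support hours?"}],
    "orq": {
      "cache": {"type": "exact_match", "ttl": 1800}
    }
  }'
```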
Adding a knowledge base
You can ground the conversation in domain-specific knowledge by linking knowledge_bases, as in the sketch below.
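A sketch of linking knowledge bases to a request, using the KB names that appear later in this guide; the placement of the knowledge_bases field is an assumption.

```bash
# Knowledge base sketch: the gateway retrieves relevant content from the
# listed KBs and grounds the answer in it. Field placement is assumed.
curl https://api.orq.ai/v2/proxy/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "How do I authenticate API calls?"}],
    "orq": {
      "knowledge_bases": ["api-documentation", "integration-examples"]
    }
  }'
```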
Contact & Thread Tracking
Orq.ai allows tracking users and support sessions using contact and thread objects; a sketch follows.
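Here is how attaching a contact and a thread to a request might look, so the session shows up in your analytics. The identifiers below are hypothetical and the field shapes are assumptions.

```bash
# Contact/thread sketch: "contact" identifies the user, "thread" clusters
# the messages of one support session. IDs below are hypothetical.
curl https://api.orq.ai/v2/proxy/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "My integration test fails."}],
    "orq": {
      "contact": {"id": "user_1234"},
      "thread": {"id": "thread_5678"}
    }
  }'
```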
Dynamic inputs
Placeholders like {{company_name}}, {{customer_tier}}, and {{use_case}} are automatically replaced at runtime.
Prompts are personalized for each user/session without rewriting messages manually.
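For example, a templated system prompt plus orq.inputs might look like the sketch below; the placeholders are resolved at runtime from the inputs object. The payload shape is assumed and the input values are illustrative.

```bash
# Dynamic inputs sketch: {{...}} placeholders in the prompt are filled
# from orq.inputs at runtime. Values below are illustrative.
curl https://api.orq.ai/v2/proxy/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "system",
       "content": "You support {{company_name}} ({{customer_tier}} tier) with {{use_case}}."},
      {"role": "user", "content": "How do I rotate my API key?"}
    ],
    "orq": {
      "inputs": {
        "company_name": "Acme Corp",
        "customer_tier": "enterprise",
        "use_case": "API integration"
      }
    }
  }'
```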
Building a Reliable Customer Support Agent
Imagine you’re creating a customer support agent for your company, Orq AI, which helps enterprise customers integrate APIs. You want it to be:
- Reliable: automatically retry or fall back if a model fails.
- Contextually aware: grounded in internal documentation and examples.
- Fast and cost-efficient: using caching for repeated queries.
- Traceable: track conversations per user and session.
- Personalized: dynamic prompts based on user type and project.
Here is how each feature contributes (a combined request sketch follows the list):
- Dynamic Inputs / Prompt Templating
  - Placeholders like {{company_name}}, {{customer_tier}}, and {{use_case}} are automatically replaced using the orq.inputs values.
  - Effect: Each customer gets a personalized, context-aware response without rewriting prompts.
- Retries & Fallbacks
  - If gpt-4o fails or is rate-limited (429 or 5xx errors), Orq.ai retries up to 3 times.
  - If retries fail, it automatically falls back to Anthropic Claude or GPT-4o-mini.
  - Effect: The agent remains highly reliable and doesn’t leave customers waiting.
- Caching
  - Repeated queries with the same input return instantly from the cache (ttl: 1800s).
  - Effect: Reduces latency and API usage, saving costs and improving responsiveness.
- Knowledge Bases
  - The agent pulls relevant documents from internal KBs like api-documentation or integration-examples.
  - Effect: Responses are grounded in your company’s content, making them accurate and trustworthy.
- Contact & Thread Tracking
  - Each session is linked to a contact (user) and a thread (conversation cluster).
  - Effect: Enables session observability, analytics, and organized support tracking for enterprise customers.
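Putting it all together, one request sketch that combines retries, fallbacks, caching, knowledge bases, contact/thread tracking, and dynamic inputs. As before, the endpoint path and the field names under "orq" are assumptions; the individual settings are the ones covered in this guide.

```bash
# Full customer support agent sketch, combining every feature above.
# Endpoint, "orq" field names, and IDs are assumptions/hypothetical.
curl https://api.orq.ai/v2/proxy/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "system",
       "content": "You support {{company_name}} ({{customer_tier}} tier) with {{use_case}}."},
      {"role": "user", "content": "Our webhook retries keep timing out."}
    ],
    "orq": {
      "inputs": {
        "company_name": "Acme Corp",
        "customer_tier": "enterprise",
        "use_case": "API integration"
      },
      "retries": {"count": 3, "on_status": [429, 500, 502, 503]},
      "fallbacks": ["anthropic/claude-3-5-sonnet", "openai/gpt-4o-mini"],
      "cache": {"type": "exact_match", "ttl": 1800},
      "knowledge_bases": ["api-documentation", "integration-examples"],
      "contact": {"id": "user_1234"},
      "thread": {"id": "thread_5678"}
    }
  }'
```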