

AI Router

Overview

LangChain is a framework for building LLM-powered applications through composable chains, agents, and integrations with external data sources. By connecting LangChain to Orq.ai’s AI Router, you access 300+ models through a single base URL change.

Key Benefits

Orq.ai’s AI Router enhances your LangChain applications with:

Complete Observability

Track every chain step, tool use, and LLM call with detailed traces

Built-in Reliability

Automatic fallbacks, retries, and load balancing for production resilience

Cost Optimization

Real-time cost tracking and spend management across all your AI operations

Multi-Provider Access

Access 300+ LLMs from 20+ providers through a single, unified integration

Prerequisites

Before integrating LangChain with Orq.ai, ensure you have:
  • An Orq.ai account and API Key
  • Python 3.8 or higher
To set up your API key, see API keys & Endpoints.
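Both prerequisites can be checked programmatically before writing any LangChain code. A minimal sketch using only the standard library (the ORQ_API_KEY variable name matches the configuration examples in this guide):

```python
import os
import sys

def check_prerequisites():
    """Return a list of problems; an empty list means the environment looks ready."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ required, found " + sys.version.split()[0])
    if not os.getenv("ORQ_API_KEY"):
        problems.append("ORQ_API_KEY is not set")
    return problems

for problem in check_prerequisites():
    print("warning:", problem)
```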

Installation

pip install langchain langchain-openai

Configuration

Configure LangChain to use Orq.ai’s AI Router via ChatOpenAI with a custom base_url:
Python
from langchain_openai import ChatOpenAI
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

Basic Example

Python
from langchain_openai import ChatOpenAI
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

result = llm.invoke("Explain quantum computing in simple terms.")
print(result.content)

Chains

Build composable chains using LangChain’s pipe operator:
Python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
])

chain = prompt | llm
result = chain.invoke({"input": "Tell me a joke about programming."})
print(result.content)

Streaming

Python
from langchain_openai import ChatOpenAI
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

for chunk in llm.stream("Write a short poem about the ocean."):
    print(chunk.content, end="", flush=True)
print()

Model Selection

With Orq.ai, you can use any supported model from 20+ providers:
Python
from langchain_openai import ChatOpenAI
import os

# Use Claude
claude = ChatOpenAI(
    model="anthropic/claude-sonnet-4-5",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

# Use Gemini
gemini = ChatOpenAI(
    model="gemini-2.5-flash",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

# Use Groq
groq = ChatOpenAI(
    model="groq/llama-3.3-70b-versatile",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)
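Because every model is reached through the same endpoint, switching providers is only a string change. One way to take advantage of that is to centralize model IDs in a small lookup; the mapping below is purely illustrative, reusing the model IDs from the examples above:

```python
# Illustrative mapping of task names to router model IDs. The IDs are taken
# from the examples in this section; adjust to the models enabled in your workspace.
MODELS = {
    "reasoning": "anthropic/claude-sonnet-4-5",
    "fast": "gemini-2.5-flash",
    "bulk": "groq/llama-3.3-70b-versatile",
}

def model_for(task):
    """Resolve a task name to a router model ID, defaulting to gpt-4o."""
    return MODELS.get(task, "gpt-4o")

# Any resolved ID can be passed straight to ChatOpenAI(model=..., base_url=...).
print(model_for("fast"))  # → gemini-2.5-flash
```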

Observability

orq_ai_sdk.langchain provides a global setup() function that automatically instruments all LangChain and LangGraph components. Call it once at the top of your application, and every LLM call, graph node, tool execution, and retrieval is traced automatically; no callback wiring is needed.

Zero configuration

A single setup() call makes tracing live: no callbacks, no OpenTelemetry exporters, no extra wiring.

Full graph visibility

Traces preserve the parent-child structure of your LangGraph so you see exactly which node triggered each LLM call or tool use.

Token usage and costs

Input and output token counts are captured on every LLM call and synced to Orq.ai for cost tracking.

Retrieval tracking

Retrieval events include the query and all returned documents, making RAG pipelines fully inspectable.
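Since input and output token counts are captured on every call, turning them into a rough cost estimate is simple arithmetic. The per-million-token prices below are placeholders for illustration, not real Orq.ai or provider pricing:

```python
# Hypothetical per-million-token prices in USD; real prices vary by model and provider.
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate a single call's cost in USD from captured token counts."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000

# 1200 input and 350 output tokens at the placeholder gpt-4o prices.
print(f"${estimate_cost('gpt-4o', 1200, 350):.6f}")  # → $0.006500
```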

Installation

pip install orq-ai-sdk langchain-core langchain-openai langgraph
orq-ai-sdk is the Orq.ai Python SDK. See the repository for the full reference and changelog.

Environment Variables

Set your API keys before running your application:
export ORQ_API_KEY="your-orq-api-key"
export OPENAI_API_KEY="your-openai-api-key" # required because the examples call OpenAI models directly
Or set them in code:
Python
import os
os.environ["ORQ_API_KEY"] = "your-orq-api-key"
os.environ["OPENAI_API_KEY"] = "your-openai-api-key" # required because the examples call OpenAI models directly

Basic Example

setup() must be called before importing or using any LangChain components. It globally instruments LangChain so that all subsequent graphs and chains are traced automatically.
Python
from orq_ai_sdk.langchain import setup

setup()

from typing import Annotated
from typing_extensions import TypedDict
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages

class State(TypedDict):
    messages: Annotated[list, add_messages]

graph_builder = StateGraph(State)
llm = ChatOpenAI(model="gpt-4o", temperature=0.2)

def chatbot(state: State):
    return {"messages": [llm.invoke(state["messages"])]}

graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph_builder.add_edge("chatbot", END)

graph = graph_builder.compile()

result = graph.invoke({"messages": [{"role": "user", "content": "Hello!"}]})
print(result["messages"][-1].content)

Async Example

Python
from orq_ai_sdk.langchain import setup

setup()

import asyncio
from typing import Annotated
from typing_extensions import TypedDict
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages

class State(TypedDict):
    messages: Annotated[list, add_messages]

graph_builder = StateGraph(State)
llm = ChatOpenAI(model="gpt-4o", temperature=0.2)

def chatbot(state: State):
    return {"messages": [llm.invoke(state["messages"])]}

graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph_builder.add_edge("chatbot", END)

graph = graph_builder.compile()

async def main():
    result = await graph.ainvoke({"messages": [{"role": "user", "content": "Hello!"}]})
    print(result["messages"][-1].content)

asyncio.run(main())
Use graph.ainvoke() instead of graph.invoke() for async execution. The setup() instrumentation works with both sync and async invocations.
OrqLangchainCallback is still available and fully backward compatible. Existing code using the callback handler will continue to work. However, the recommended approach is to use the global setup() function for simpler integration and automatic instrumentation.

Viewing Traces

Traces appear in the Orq.ai Studio under the Traces tab. Each run is captured as a tree reflecting your graph structure: top-level chain spans for each node, with LLM calls, tool executions, and retrievals nested underneath.
LangChain trace in the AI Studio

What Gets Traced

| Event | Details captured |
| --- | --- |
| Graph nodes (chains) | Node name, inputs, outputs, duration |
| LLM calls | Messages, model, token usage, finish reason |
| Tool executions | Tool name, input, output, duration |
| Retrievals | Query, returned documents |
| Agent actions | Action taken, finish output |
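The parent-child structure these traces preserve can be pictured with a small rendering sketch. The span names and nesting below are illustrative, mirroring the event table above rather than any real SDK output:

```python
# Illustrative trace tree: a graph-node chain span with nested LLM and tool spans.
trace = {
    "name": "chatbot (chain)",
    "children": [
        {"name": "gpt-4o (llm)", "children": []},
        {"name": "search (tool)", "children": []},
    ],
}

def render(span, depth=0):
    """Flatten a span tree into indented lines, one line per span."""
    lines = ["  " * depth + span["name"]]
    for child in span["children"]:
        lines.extend(render(child, depth + 1))
    return lines

print("\n".join(render(trace)))
```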