AI Router

Overview

LangChain is a framework for building LLM-powered applications through composable chains, agents, and integrations with external data sources. By connecting LangChain to Orq.ai’s AI Router, you access 300+ models through a single base URL change.

Key Benefits

Orq.ai’s AI Router enhances your LangChain applications with:

Complete Observability

Track every chain step, tool use, and LLM call with detailed traces

Built-in Reliability

Automatic fallbacks, retries, and load balancing for production resilience

Cost Optimization

Real-time cost tracking and spend management across all your AI operations

Multi-Provider Access

Access 300+ LLMs and 20+ providers through a single, unified integration

Prerequisites

Before integrating LangChain with Orq.ai, ensure you have:
  • An Orq.ai account and API Key
  • Python 3.8 or higher
To set up your API key, see API keys & Endpoints.
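The examples below read the key from the ORQ_API_KEY environment variable. One way to set it in your shell (the key value shown is a placeholder):

```shell
# Make the key available to the examples below; replace the placeholder
# with your actual key from the Orq.ai Studio.
export ORQ_API_KEY="your-orq-api-key"
```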

Installation

pip install langchain langchain-openai

Configuration

Configure LangChain to use Orq.ai’s AI Router via ChatOpenAI with a custom base_url:
Python
from langchain_openai import ChatOpenAI
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v2/router",
)

Basic Example

Python
from langchain_openai import ChatOpenAI
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v2/router",
)

result = llm.invoke("Explain quantum computing in simple terms.")
print(result.content)

Chains

Build composable chains using LangChain’s pipe operator:
Python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v2/router",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
])

chain = prompt | llm
result = chain.invoke({"input": "Tell me a joke about programming."})
print(result.content)

Streaming

Python
from langchain_openai import ChatOpenAI
import os

llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v2/router",
)

for chunk in llm.stream("Write a short poem about the ocean."):
    print(chunk.content, end="", flush=True)
print()

Model Selection

With Orq.ai, you can use any supported model from 20+ providers:
Python
from langchain_openai import ChatOpenAI
import os

# Use Claude
claude = ChatOpenAI(
    model="anthropic/claude-sonnet-4-5",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v2/router",
)

# Use Gemini
gemini = ChatOpenAI(
    model="gemini-2.5-flash",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v2/router",
)

# Use Groq
groq = ChatOpenAI(
    model="groq/llama-3.3-70b-versatile",
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v2/router",
)
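Note the naming convention above: models from some providers are namespaced as provider/model, while others use a bare model name. A small illustrative helper (not part of any SDK) that splits such an identifier:

```python
from typing import Optional, Tuple

def parse_model_id(model_id: str) -> Tuple[Optional[str], str]:
    """Split a router model identifier into (provider, model).

    Identifiers like "groq/llama-3.3-70b-versatile" carry an explicit
    provider prefix; bare names like "gpt-4o" do not.
    """
    provider, sep, name = model_id.partition("/")
    if not sep:
        return None, model_id
    return provider, name

print(parse_model_id("groq/llama-3.3-70b-versatile"))  # ('groq', 'llama-3.3-70b-versatile')
print(parse_model_id("gpt-4o"))  # (None, 'gpt-4o')
```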

Observability

OrqLangchainCallback is a native LangChain callback handler from the Orq.ai Python SDK. Attach it once to your compiled graph and it automatically captures the full execution hierarchy across every run: graph nodes, LLM calls, tool executions, and retrievals.

Zero configuration

No OpenTelemetry setup, no exporters, no environment variables. Two lines of code and tracing is live.

Full graph visibility

Traces preserve the parent-child structure of your LangGraph so you see exactly which node triggered each LLM call or tool use.

Token usage and costs

Input and output token counts are captured on every LLM call and synced to Orq.ai for cost tracking.
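Per-call token counts also make back-of-the-envelope cost estimates straightforward. A minimal sketch, assuming hypothetical per-million-token prices (the figures below are placeholders, not actual rates):

```python
# Hypothetical USD prices per million tokens; actual rates vary by
# model and provider.
PRICES = {"gpt-4o": {"input": 2.50, "output": 10.00}}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single call from its token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# 1,000 input tokens and 500 output tokens at the placeholder rates:
print(call_cost("gpt-4o", 1_000, 500))  # 0.0075
```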

Retrieval tracking

Retrieval events include the query and all returned documents, making RAG pipelines fully inspectable.

Installation

pip install orq-ai-sdk langchain-core langchain-openai langgraph
orq-ai-sdk is the Orq.ai Python SDK. See the repository for the full reference and changelog.

Basic Example

Python
import os
from typing import Annotated
from typing_extensions import TypedDict
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from orq_ai_sdk.langchain import OrqLangchainCallback

class State(TypedDict):
    messages: Annotated[list, add_messages]

graph_builder = StateGraph(State)
llm = ChatOpenAI(model="gpt-4o", temperature=0.2)  # requires OPENAI_API_KEY

def chatbot(state: State):
    return {"messages": [llm.invoke(state["messages"])]}

graph_builder.add_node("chatbot", chatbot)
graph_builder.add_edge(START, "chatbot")
graph_builder.add_edge("chatbot", END)

orq_handler = OrqLangchainCallback(
    api_key=os.getenv("ORQ_API_KEY"),
)

graph = graph_builder.compile().with_config({"callbacks": [orq_handler]})

result = graph.invoke({"messages": [{"role": "user", "content": "Hello!"}]})
print(result["messages"][-1].content)
The .with_config({"callbacks": [orq_handler]}) call bakes the handler into the compiled graph so all subsequent invocations are traced automatically without passing callbacks manually each time.

What Gets Traced

Event                   Details captured
Graph nodes (chains)    Node name, inputs, outputs, duration
LLM calls               Messages, model, token usage, finish reason
Tool executions         Tool name, input, output, duration
Retrievals              Query, returned documents
Agent actions           Action taken, finish output

Viewing Traces

Traces appear in the Orq.ai Studio under the Traces tab. Each run is captured as a tree reflecting your graph structure: top-level chain spans for each node, with LLM calls, tool executions, and retrievals nested underneath.
[Image: LangChain trace in the Orq.ai Studio]