Microsoft Semantic Kernel
Overview
Microsoft Semantic Kernel is an SDK that integrates Large Language Models (LLMs) with conventional programming languages. By connecting Semantic Kernel to Orq.ai’s AI Router, you transform experimental AI agents into production-ready systems with enterprise-grade capabilities.
Key Benefits
Orq.ai’s AI Router enhances your Semantic Kernel applications with:
Complete Observability: Track every agent step, tool call, and interaction with detailed traces and analytics.
Built-in Reliability: Automatic fallbacks, retries, and load balancing for production resilience.
Cost Optimization: Real-time cost tracking and spend management across all your AI operations.
Multi-Provider Access: Reach 300+ LLMs from 20+ providers through a single, unified integration.
Prerequisites
Before integrating Semantic Kernel with Orq.ai, ensure you have:
An Orq.ai account and API Key
Python 3.10 or higher (required by current Semantic Kernel releases)
Semantic Kernel SDK installed
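To fail fast on a missing prerequisite, you can add a quick runtime check. This is a convenience sketch, not part of the official setup; it assumes the examples below read the API key from the ORQ_API_KEY environment variable:

import os
import sys

# Recent Semantic Kernel releases target Python 3.10+
assert sys.version_info >= (3, 10), "Python 3.10+ required"
# All examples in this guide read the key from this variable
assert os.getenv("ORQ_API_KEY"), "Set ORQ_API_KEY to your Orq.ai API key"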
Installation
Install Semantic Kernel and the OpenAI SDK:
pip install semantic-kernel openai
Configuration
Configure Semantic Kernel to use Orq.ai’s AI Router by creating an OpenAI client with a custom base URL:
from openai import AsyncOpenAI
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
import os

# Configure the OpenAI client to point at Orq.ai's AI Router
client = AsyncOpenAI(
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

# Create the kernel
kernel = Kernel()

# Add a chat completion service backed by the router
chat_service = OpenAIChatCompletion(
    ai_model_id="gpt-4o",
    async_client=client,
)
kernel.add_service(chat_service)
The only change from a standard OpenAI setup is the base_url: https://api.orq.ai/v3/router
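Before wiring the client into Semantic Kernel, you can verify connectivity with a plain chat completion call through the router. A minimal smoke test, using only the OpenAI SDK; the model id gpt-4o is just one example of a model available through the router:

import asyncio
import os
from openai import AsyncOpenAI

async def smoke_test():
    client = AsyncOpenAI(
        api_key=os.getenv("ORQ_API_KEY"),
        base_url="https://api.orq.ai/v3/router",
    )
    # A single round-trip confirms the key and base_url are correct
    completion = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "ping"}],
    )
    print(completion.choices[0].message.content)

asyncio.run(smoke_test())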
Basic Example
Here’s a complete example of using Semantic Kernel with Orq.ai:
from openai import AsyncOpenAI
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory
import asyncio
import os

async def main():
    # Configure the client with Orq.ai
    client = AsyncOpenAI(
        api_key=os.getenv("ORQ_API_KEY"),
        base_url="https://api.orq.ai/v3/router",
    )

    # Create the kernel
    kernel = Kernel()

    # Add a chat completion service
    chat_service = OpenAIChatCompletion(
        ai_model_id="gpt-4o",
        async_client=client,
    )
    kernel.add_service(chat_service)

    # Create execution settings
    settings = OpenAIChatPromptExecutionSettings(
        max_tokens=2000,
        temperature=0.7,
    )

    # Create the chat history
    history = ChatHistory()
    history.add_user_message("What is quantum computing?")

    # Get the response
    response = await chat_service.get_chat_message_content(
        chat_history=history,
        settings=settings,
        kernel=kernel,
    )
    print(response.content)

if __name__ == "__main__":
    asyncio.run(main())
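Because ChatHistory is mutable conversation state, a multi-turn chat is just more appends followed by another call. A short continuation that would sit inside main() above, after the first response (add_assistant_message is part of the ChatHistory API):

    # Record the assistant's reply, then ask a follow-up in the same session
    history.add_assistant_message(response.content)
    history.add_user_message("Now explain it in one sentence.")

    follow_up = await chat_service.get_chat_message_content(
        chat_history=history,
        settings=settings,
        kernel=kernel,
    )
    print(follow_up.content)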
Using Plugins (Functions)
Semantic Kernel’s power comes from combining LLMs with plugins. Here’s how to use them with Orq.ai:
from openai import AsyncOpenAI
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.functions import kernel_function
from semantic_kernel.contents import ChatHistory
import asyncio
import os

# Define a plugin
class WeatherPlugin:
    @kernel_function(
        name="get_weather",
        description="Get the weather for a location",
    )
    def get_weather(self, location: str) -> str:
        """Get weather for a location."""
        return f"The weather in {location} is sunny and 72°F"

async def main():
    # Configure the client
    client = AsyncOpenAI(
        api_key=os.getenv("ORQ_API_KEY"),
        base_url="https://api.orq.ai/v3/router",
    )

    # Create the kernel
    kernel = Kernel()

    # Add a chat completion service
    chat_service = OpenAIChatCompletion(
        ai_model_id="gpt-4o",
        async_client=client,
    )
    kernel.add_service(chat_service)

    # Add the plugin
    kernel.add_plugin(
        WeatherPlugin(),
        plugin_name="WeatherPlugin",
    )

    # Create the chat history
    history = ChatHistory()
    history.add_user_message("What's the weather in San Francisco?")

    # Enable automatic function calling
    execution_settings = OpenAIChatPromptExecutionSettings(
        function_choice_behavior=FunctionChoiceBehavior.Auto(),
    )

    # Get a response that can call the plugin automatically
    response = await chat_service.get_chat_message_content(
        chat_history=history,
        settings=execution_settings,
        kernel=kernel,
    )
    print(response.content)

if __name__ == "__main__":
    asyncio.run(main())
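For testing, you can also invoke a plugin function directly, bypassing the model. A sketch using the keyword form of Kernel.invoke, assuming a recent Semantic Kernel release where functions are resolved by plugin_name and function_name:

    # Inside main(), after kernel.add_plugin(...): call the function directly
    result = await kernel.invoke(
        plugin_name="WeatherPlugin",
        function_name="get_weather",
        location="San Francisco",
    )
    print(result)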
Model Selection
With Orq.ai, you can use any supported model from 20+ providers:
from openai import AsyncOpenAI
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
import os

# Configure the client
client = AsyncOpenAI(
    api_key=os.getenv("ORQ_API_KEY"),
    base_url="https://api.orq.ai/v3/router",
)

kernel = Kernel()

# Use Claude
claude_service = OpenAIChatCompletion(
    ai_model_id="claude-sonnet-4-5-20250929",
    async_client=client,
    service_id="claude",
)
kernel.add_service(claude_service)

# Use Gemini
gemini_service = OpenAIChatCompletion(
    ai_model_id="gemini-2.5-flash",
    async_client=client,
    service_id="gemini",
)
kernel.add_service(gemini_service)

# Use any other supported model, e.g. Llama on Groq
groq_service = OpenAIChatCompletion(
    ai_model_id="llama-3.3-70b-versatile",
    async_client=client,
    service_id="groq",
)
kernel.add_service(groq_service)
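With several services registered, you pick one per call by its service_id. A brief sketch reusing the kernel from the snippet above; get_service is the standard Kernel accessor, and the ids match those registered earlier:

import asyncio
from semantic_kernel.contents import ChatHistory
from semantic_kernel.connectors.ai.open_ai import OpenAIChatPromptExecutionSettings

async def ask_claude():
    history = ChatHistory()
    history.add_user_message("Summarize special relativity in one sentence.")

    # Resolve the Claude-backed service registered above by its service_id
    claude = kernel.get_service("claude")
    response = await claude.get_chat_message_content(
        chat_history=history,
        settings=OpenAIChatPromptExecutionSettings(),
        kernel=kernel,
    )
    print(response.content)

asyncio.run(ask_claude())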
Streaming Responses
Semantic Kernel supports streaming with Orq.ai:
from openai import AsyncOpenAI
from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import (
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory
import asyncio
import os

async def main():
    client = AsyncOpenAI(
        api_key=os.getenv("ORQ_API_KEY"),
        base_url="https://api.orq.ai/v3/router",
    )

    kernel = Kernel()
    chat_service = OpenAIChatCompletion(
        ai_model_id="gpt-4o",
        async_client=client,
    )
    kernel.add_service(chat_service)

    settings = OpenAIChatPromptExecutionSettings(
        max_tokens=2000,
        temperature=0.7,
    )

    history = ChatHistory()
    history.add_user_message("Write a short story about AI")

    # Stream the response chunk by chunk
    async for message in chat_service.get_streaming_chat_message_content(
        chat_history=history,
        settings=settings,
        kernel=kernel,
    ):
        print(message.content, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())
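If you also need the complete text after streaming, for logging or to append to the history, accumulate the chunks as they arrive. A small variation of the loop inside main() above, using only the APIs already shown:

    # Inside main(): collect chunks while still printing them live
    full_reply = ""
    async for message in chat_service.get_streaming_chat_message_content(
        chat_history=history,
        settings=settings,
        kernel=kernel,
    ):
        if message.content:
            print(message.content, end="", flush=True)
            full_reply += message.content

    # Keep the conversation state consistent for follow-up turns
    history.add_assistant_message(full_reply)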
Observability & Monitoring
All Semantic Kernel interactions routed through Orq.ai are automatically tracked and available in the AI Studio:
Request Traces: View complete conversation flows and function calls
Plugin Usage: Monitor which plugins are being invoked and their success rates
Performance Metrics: Track latency, token usage, and completion rates
Cost Analysis: Understand spending patterns across models and providers
Visit your AI Studio to view real-time analytics and traces.
Evaluations & Experiments
Once your agents are running, use Evaluatorq to score outputs across a dataset and Experiments to compare configurations side-by-side.
Run Evaluations with Evaluatorq: Run parallel evaluations across your agents and compare results.
Run Experiments via the API: Compare agent configurations and view results in the AI Studio.