Release 3.12
v3.12.0
Enhanced observability, smarter experimentation, and better cost controls for AI applications.

πŸ” Enhanced Observability & Debugging

Human-in-the-Loop Reviews

  • Collect structured feedback on AI outputs with customizable Human Review Sets per trace type
  • Directly add spans to datasets for continuous improvement
  • Track contact IDs and thread context across chat completions

Faster Root Cause Analysis

  • View retrieval configurations directly in span properties
  • See evaluator names on spans for quick performance assessment
  • Expanded OpenTelemetry support for more frameworks

Cost Optimization

  • Optional response caching to reduce latency and API costs
  • Fixed cost aggregation for image operations and Azure OpenAI
  • More accurate token and billing tracking

πŸ§ͺ Streamlined Experimentation

Improved Experiment Management

  • Search across experiment entries
  • Protection against accidental re-runs
  • Persistent column settings and better cancellation handling
  • Enhanced UI with clearer active states and progress indicators

πŸš€ AI Gateway Enhancements

Advanced Request Handling

  • Automatic retries and fallback models for improved reliability
  • Thread and contact tracking for conversation continuity
  • Specify prompt versions directly in LLM calls
  • Improved SSE streaming performance

πŸ’° Budget Controls

Workspace-Level Cost Management

  • Set and monitor budgets at workspace and contact levels
  • New Budgets API for programmatic cost control
  • Automated alerts and spending limits

🎯 Platform Improvements

Model Management

  • New image generation models and providers
  • Intelligent model filtering based on capabilities
  • Improved cost extraction and model selection UI

Developer Experience

  • Better API parameter documentation for Knowledge Base
  • Unsaved changes protection across Teams and Contacts
  • Improved error handling and retry logic throughout
AI Gateway - Superpower your LLM requests
v3.12.0
Today, we are bringing all the power of Deployments to our AI Gateway. Teams can now run their AI workloads through a reliable, battle-tested AI Gateway.
Features supported via the Gateway:
  • Fallbacks
  • Retry
  • Contact Tracking
  • Thread Management
  • Cache
  • Knowledge Bases
cURL
curl --location 'https://api.orq.ai/v2/router/chat/completions' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $ORQ_API_KEY" \
--data-raw '{
  "model": "openai/gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful customer support agent for {{company_name}}. Use available knowledge to assist {{customer_tier}} customers."
    },
    {
      "role": "user",
      "content": "I need help with API integration for my {{use_case}} project"
    }
  ],
  "orq": {
    "retry": {
      "count": 3,
      "on_codes": [429, 500, 502, 503, 504]
    },
    "fallbacks": [
      { "model": "anthropic/claude-3-5-sonnet-20241022" },
      { "model": "openai/gpt-4o-mini" }
    ],
    "cache": {
      "type": "exact_match",
      "ttl": 1800
    },
    "knowledge_bases": [
      {
        "knowledge_id": "api-documentation",
        "top_k": 5,
        "threshold": 0.75
      },
      {
        "knowledge_id": "integration-examples",
        "top_k": 3,
        "threshold": 0.8
      }
    ],
    "contact": {
      "id": "enterprise_customer_001",
      "display_name": "Enterprise User",
      "email": "user@enterprise.com"
    },
    "thread": {
      "id": "support_session_001",
      "tags": ["api-integration", "enterprise", "technical-support"]
    },
    "inputs": {
      "company_name": "Orq AI",
      "customer_tier": "Enterprise",
      "use_case": "e-commerce platform"
    }
  }
}'
Start building today. To learn more, see the AI Gateway documentation.
OpenTelemetry - LangChain, LangGraph, OpenAI Agents and more
v3.12.0
Monitor and debug your AI pipelines with production-grade observability:
  • Complete request tracing across LLM calls, chain executions, and agent workflows
  • Automatic instrumentation for latency, token usage, and error tracking
  • Zero-code instrumentation for supported frameworks
Key benefits:
  • Identify bottlenecks in complex multi-step AI workflows
  • Track costs with token-level granularity
  • Debug agent reasoning paths and tool usage in production
  • Correlate AI operations with upstream/downstream services
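If a framework is not covered by zero-code instrumentation, you can still export spans with the standard OpenTelemetry SDK. A minimal Python sketch, assuming Orq exposes a standard OTLP/HTTP traces endpoint (the URL below is an assumption; see the Observability docs for the actual value):
Python
import os

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

# Assumed OTLP/HTTP endpoint; check the Orq observability docs for the real URL
exporter = OTLPSpanExporter(
    endpoint="https://api.orq.ai/v2/otel/v1/traces",
    headers={"Authorization": f"Bearer {os.environ['ORQ_API_KEY']}"},
)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("my-ai-app")
with tracer.start_as_current_span("llm-call") as span:
    span.set_attribute("model", "openai/gpt-4o")
    # ... make your LLM call here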

Supported frameworks

  • Agno
  • AutoGen
  • BeeAI
  • CrewAI
  • DSPy
  • Google ADK
  • Haystack
  • Instructor
  • LangChain / LangGraph
  • LiteLLM
  • LiveKit
  • LlamaIndex
  • Mastra
  • OpenAI Agents
  • Pydantic AI
  • Semantic Kernel
  • SmolAgents
  • Vercel AI SDK
AI Gateway - Vercel AI SDK
v3.12.0
A Vercel AI SDK provider for the Orq AI platform that enables seamless integration of AI models with the Vercel AI SDK ecosystem.
🎯 Features
  • Full Vercel AI SDK Compatibility: Works with all Vercel AI SDK functions (generateText, streamText, embed, etc.)
  • Multiple Model Types: Support for chat, completion, embedding, and image generation models
  • Streaming Support: Real-time streaming responses for a better user experience
  • Type-safe: Fully written in TypeScript with comprehensive type definitions
  • Orq Platform Integration: Direct access to Orq AI’s model routing and optimization

Installation

npm i ai @orq-ai/vercel-provider

Getting Started

Node.js
import { createOrqAiProvider } from "@orq-ai/vercel-provider";
import { generateText } from "ai";

const orq = createOrqAiProvider({
  apiKey: process.env.ORQ_API_KEY,
});

const { text } = await generateText({
  model: orq("gpt-4"),
  messages: [{ role: "user", content: "Hello!" }],
});
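The provider also works with the SDK's streaming functions. A minimal sketch using streamText, assuming the same provider setup as above:
Node.js
import { createOrqAiProvider } from "@orq-ai/vercel-provider";
import { streamText } from "ai";

const orq = createOrqAiProvider({ apiKey: process.env.ORQ_API_KEY });

// textStream is an async iterable that yields text chunks as they arrive
const result = await streamText({
  model: orq("gpt-4"),
  messages: [{ role: "user", content: "Tell me a short story." }],
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}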
Find more info in the GitHub repository.
Human Reviews in Traces
v3.12.0
Report feedback directly on traces using Human Reviews and Human Review Sets. Configure custom review sets per trace type, or tailor human reviews by application name or trace name for granular feedback collection.
Custom instrumentation with @traced
v3.12.0
We’ve introduced the @traced decorator, a powerful new way to capture function-level traces directly in your Python code.
  • Automatically logs function inputs, outputs, and metadata
  • Supports nested spans and custom span types (LLM, agent, tool, etc.)
  • Works seamlessly with the Orq SDK initialization (no separate init required)
  • Integrates with OpenTelemetry for end-to-end distributed tracing
This makes it easier than ever to debug, monitor, and observe your applications in real time.
Python
import time
from orq_ai_sdk import traced

@traced
def process_user(user_id: str, action: str) -> dict:
	# Simulate some processing
	time.sleep(0.1)

	result = {
		"user_id": user_id,
		"action": action,
		"status": "completed",
		"timestamp": time.time()
	}
	return result
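Because nested spans are supported, decorated functions that call each other are captured as parent and child spans. A minimal sketch of that pattern, assuming the same default decorator usage as above:
Python
from orq_ai_sdk import traced

@traced
def fetch_profile(user_id: str) -> dict:
    # Recorded as a child span of process_request
    return {"user_id": user_id, "tier": "enterprise"}

@traced
def process_request(user_id: str) -> dict:
    # Recorded as the parent span; the call below nests under it
    profile = fetch_profile(user_id)
    return {"status": "completed", "profile": profile}

process_request("user_123")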
To get started, install the Orq.ai SDK for Python:
pip install orq-ai-sdk
To learn more, see our Observability Frameworks.
Cerebras and Jina support in the Model Garden
v3.12.0
Access high-performance AI models through a unified interface:
  • Cerebras: Ultra-fast LLM inference with sub-second response times for Llama and other open models
  • Jina: State-of-the-art multilingual embeddings (jina-embeddings-v3) and reranking models for RAG pipelines
API Key Flexibility:
  • Bring Your Own Key (BYOK): Use your existing Cerebras or Jina API credentials
  • Managed Access: All workspaces on paid plans get automatic access through Orq.ai's pooled API keys, no separate vendor accounts needed
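Jina embeddings are reachable through the same OpenAI-compatible proxy used elsewhere in this release. A minimal sketch; the jina/jina-embeddings-v3 model slug is an assumption, so verify the exact identifier in the Model Garden:
OpenAI (Python)
from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.orq.ai/v2/proxy",
    api_key=os.getenv("ORQ_API_KEY"),
)

# Embed a small batch of documents for a RAG pipeline
response = client.embeddings.create(
    model="jina/jina-embeddings-v3",  # assumed slug; verify in the Model Garden
    input=[
        "How do I integrate the API?",
        "Authentication uses bearer tokens.",
    ],
)

print(len(response.data[0].embedding))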
ByteDance Image Models
v3.12.0
Seedream-3.0-T2I-250415
A state-of-the-art text-to-image model that generates high-resolution, photorealistic images from prompts. Supports bilingual input.
SeedEdit-3.0-I2I-250628
An advanced image-to-image model that lets you apply precise edits using both images and text prompts.

Examples

Generate image with Seedream
cURL
curl https://api.orq.ai/v2/proxy/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -d '{
    "model": "bytedance/seedream-3-0-t2i-250415",
    "prompt": "A beautiful sunset over mountains",
    "n": 1
  }'
OpenAI (Python)
from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.orq.ai/v2/proxy",
    api_key=os.getenv("ORQ_API_KEY"),
)

response = client.images.generate(
    model="bytedance/seedream-3-0-t2i-250415",
    prompt="A beautiful sunset over mountains",
    n=1
)

print(response.data[0].url)
Node.js
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: 'https://api.orq.ai/v2/proxy',
  apiKey: process.env.ORQ_API_KEY,
});

async function main() {
  const response = await openai.images.generate({
    model: "bytedance/seedream-3-0-t2i-250415",
    prompt: "A beautiful sunset over mountains",
    n: 1
  });

  console.log(response.data[0].url);
}

main();
Edit images with SeedEdit
cURL
curl https://api.orq.ai/v2/proxy/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -d '{
    "model": "bytedance/seededit-3-0-i2i-250628",
    "prompt": "A beautiful sunset over mountains",
    "n": 1
  }'
OpenAI (Python)
from openai import OpenAI
import os

client = OpenAI(
    base_url="https://api.orq.ai/v2/proxy",
    api_key=os.getenv("ORQ_API_KEY"),
)

response = client.images.generate(
    model="bytedance/seededit-3-0-i2i-250628",
    prompt="A beautiful sunset over mountains",
    n=1
)

print(response.data[0].url)
Node.js
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: 'https://api.orq.ai/v2/proxy',
  apiKey: process.env.ORQ_API_KEY,
});

async function main() {
  const response = await openai.images.generate({
    model: "bytedance/seededit-3-0-i2i-250628",
    prompt: "A beautiful sunset over mountains",
    n: 1
  });

  console.log(response.data[0].url);
}

main();
Both models run in European data centers in Germany.
Workspace Budget
v3.12.0
Added support for Workspace Budgets. You can now set a budget for your workspace to control AI spend across your organization.
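For programmatic control, the new Budgets API can create and monitor budgets. A hypothetical sketch of a create-budget request; the endpoint path and payload fields below are assumptions, so consult the Budgets API reference for the actual contract:
cURL
# Hypothetical endpoint and payload shape; see the Budgets API reference
curl https://api.orq.ai/v2/budgets \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -d '{
    "scope": "workspace",
    "limit": 500,
    "currency": "USD",
    "period": "monthly"
  }'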