Tracing
AI workflows can feel like a black box: when something goes wrong, it’s hard to know why. Tracing changes that by giving you full visibility into every step of your workflow. Instead of guessing why an LLM output is wrong, you can inspect each step directly, saving time and reducing frustration.
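To make that concrete, here is a minimal, self-contained sketch of the kind of data a trace captures for each step: name, inputs, output, and latency. It uses plain Python with a hypothetical `traced` decorator and stub workflow steps, not the Orq.ai SDK.

```python
import functools
import time

TRACE = []  # collected spans for one workflow run

def traced(step_name):
    """Hypothetical decorator: record inputs, output, and latency per step."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            TRACE.append({
                "step": step_name,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": result,
                "latency_ms": round((time.perf_counter() - start) * 1000, 2),
            })
            return result
        return wrapper
    return decorator

@traced("retrieve")
def retrieve(query):
    return ["doc snippet about pricing"]  # stub retrieval step

@traced("generate")
def generate(query, docs):
    return f"Answer to {query!r} based on {len(docs)} document(s)"  # stub LLM step

generate("What does the pro plan cost?", retrieve("pro plan price"))
for span in TRACE:
    print(span["step"], span["latency_ms"], "ms")
```

When an output looks wrong, each span shows exactly what that step received and produced, so you can pinpoint the failing step instead of guessing.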
Python Evaluators
When building AI features, ensuring high-quality and reliable outputs is crucial. Orq.ai allows you to implement custom evaluators in Python, giving you full control over how AI-generated content is assessed and validated.
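As an illustration, a custom evaluator is typically just a Python function that receives a model output and returns a score. The return shape and the keyword rule below are hypothetical examples, not the exact contract Orq.ai expects.

```python
def evaluate(output: str) -> dict:
    """Hypothetical custom evaluator: pass only if the answer mentions the
    required facts and stays under a length budget."""
    required = ["refund", "30 days"]   # assumed policy keywords
    missing = [kw for kw in required if kw not in output.lower()]
    too_long = len(output) > 1200      # assumed budget, in characters
    passed = not missing and not too_long
    return {
        "score": 1.0 if passed else 0.0,
        "passed": passed,
        "details": {"missing_keywords": missing, "too_long": too_long},
    }

print(evaluate("Refunds are available within 30 days of purchase."))
# -> score 1.0, passed True
```

Because the evaluator is plain Python, you can encode any domain rule you like: regex checks, schema validation, or calls to another model as a judge.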
Move entities
Previously, once an entity was created, it was locked in place: you couldn’t move it to another project or directory. Now you finally can.
GPT-4.5
OpenAI has unveiled GPT-4.5, its latest and most advanced AI language model to date. Building upon the foundation of GPT-4o, GPT-4.5 offers enhanced pattern recognition, deeper world knowledge, and a more refined conversational experience. This release aims to provide users with a more intuitive and reliable AI assistant for a variety of applications.
Claude 3.7 Sonnet
Claude 3.7 Sonnet is Anthropic’s most intelligent AI model yet. It introduces hybrid reasoning, is fine-tuned for business use cases, and offers an extended 200K context window.
Increased fallbacks
We’ve enhanced our fallback and retry system to provide even greater reliability and flexibility in production use cases. Previously, Orq.ai allowed users to define a single fallback model if the primary model failed to generate a satisfactory output. Now, we’ve increased the number of fallback models to five, giving users even more control over model orchestration without additional coding.
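Conceptually, the orchestration behaves like the sketch below: try the primary model, retry on failure, then walk the fallback list in order. This is plain Python illustrating the behavior, not Orq.ai’s implementation; the model names and the `call_model` stub are placeholders.

```python
import random

def call_model(model: str, prompt: str) -> str:
    """Stub LLM call that fails randomly to exercise the fallback path."""
    if random.random() < 0.5:
        raise RuntimeError(f"{model} unavailable")
    return f"[{model}] response to: {prompt}"

def complete_with_fallbacks(prompt: str, models: list[str], retries_per_model: int = 1):
    """Try each model in order, retrying before moving to the next."""
    last_error = None
    for model in models:
        for _ in range(retries_per_model + 1):
            try:
                return model, call_model(model, prompt)
            except Exception as exc:
                last_error = exc
    raise RuntimeError(f"All models failed; last error: {last_error}")

# Primary model plus up to five fallbacks (placeholder names).
models = ["primary-model", "fallback-1", "fallback-2",
          "fallback-3", "fallback-4", "fallback-5"]
print(complete_with_fallbacks("Summarize this ticket.", models))
```

In Orq.ai you configure this chain in the UI rather than in code; the sketch just shows the order in which models are attempted.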
Importing prompts
We’ve introduced a new feature that makes it easier than ever to reuse your prompts within Orq.ai. With the import functionality, you can seamlessly bring your existing, version-controlled prompts into your Playgrounds, Experiments, and Deployments.
Datasets v2
We've made a significant update to how datasets work in Orq.ai with the release of Datasets v2. This update merges variable collections into datasets, streamlining the structure and eliminating confusion between the two concepts.
Experiments v2 - Evaluate your LLM Config
We’re excited to introduce Experiments v2, a major upgrade to our Experiments module that makes testing, evaluating, and benchmarking models and prompts more intuitive and flexible than ever.
DeepSeek Models Now Available: 67B Chat, R1, and V3
We are excited to announce the integration of DeepSeek’s latest AI models into our platform: 67B Chat, R1, and V3.