Overview

Orq Skills are pre-built, reusable workflows from the orq-ai/orq-skills repository. They come in two forms:
  • Skills: multi-step workflows that require reasoning, such as building an agent, running an experiment, or analyzing trace failures.
  • Commands: quick slash-command actions for immediate results, such as listing traces or showing analytics.
Both are built on the Agent Skills standard format, which means they work with any compatible assistant: Claude Code, Cursor, Gemini CLI, and others. Each skill encodes best practices from prompt engineering, agent design, evaluation methodology, and experimentation into a repeatable, triggered workflow.

Prerequisites

  • An active orq.ai account
  • An API key
  • The Orq MCP server connected to your assistant (see MCP Quickstart)
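Before connecting the MCP server, the API key usually needs to be available to your assistant. A minimal sketch, assuming the environment-variable name `ORQ_API_KEY` (the name used by the Orq SDKs; confirm it in your MCP Quickstart) and a placeholder key value:

```shell
# Assumption: ORQ_API_KEY is the variable the Orq tooling reads.
# Replace the placeholder with the key from your orq.ai workspace settings.
export ORQ_API_KEY="orq-your-api-key-here"

# Sanity check that the variable is set before wiring up the MCP server.
[ -n "$ORQ_API_KEY" ] && echo "ORQ_API_KEY is set"
```

Add the `export` line to your shell profile if you want the key available in every session.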

Installation

Choose the path that matches your assistant.
Claude Code (plugin):
# Installs skills, commands, agents, and the MCP server in one step
claude plugin marketplace add orq-ai/claude-plugins
claude plugin install orq-skills@orq-claude-plugin
Use one path only: the Claude Code plugin already includes the MCP server, so combining it with a separate MCP install registers the server twice. Commands (/orq:quickstart, /orq:workspace, and others) and agents are available only with the Claude Code plugin.
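For assistants other than Claude Code, the MCP server is wired up through the assistant's MCP configuration file (see the MCP Quickstart for the exact server command). The fragment below is only a sketch of the standard `mcpServers` shape used by MCP-compatible clients; the command, args, and key value are placeholders, not the actual Orq values.

```json
{
  "mcpServers": {
    "orq": {
      "command": "<command-from-mcp-quickstart>",
      "args": ["<args-from-mcp-quickstart>"],
      "env": {
        "ORQ_API_KEY": "<your-api-key>"
      }
    }
  }
}
```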

Verify

Claude Code: Run the interactive onboarding command to confirm everything is working:
/orq:quickstart
Cursor, Gemini CLI, and others: Describe a task (e.g., “list my Orq.ai agents”) and confirm the skill responds correctly.

Commands

Quick-action slash commands available in Claude Code. Use /orq:<command> to trigger them.
| Command | Description | Usage |
| --- | --- | --- |
| quickstart | Interactive onboarding: credentials, MCP setup, skills tour | /orq:quickstart |
| workspace | Workspace overview: Agents, Deployments, Prompts, Datasets, Experiments | /orq:workspace [section] |
| traces | Query and summarize Traces with filters | /orq:traces [--deployment name] [--status error] [--last 24h] |
| models | List available AI models by provider | /orq:models [search-term] |
| analytics | Usage Analytics: requests, cost, tokens, errors | /orq:analytics [--last 24h] [--group-by model] |

Skills

Skills are triggered by describing what you need. The assistant picks the right skill automatically.
| Skill | Description | Docs |
| --- | --- | --- |
| build-agent | Design, create, and configure an Orq.ai Agent with tools, instructions, Knowledge Bases, and Memory | SKILL.md |
| build-evaluator | Create validated LLM-as-a-Judge Evaluators following evaluation best practices | SKILL.md |
| analyze-trace-failures | Read production Traces, identify what is failing, build failure taxonomies, and categorize issues | SKILL.md |
| run-experiment | Create and run Orq.ai Experiments: compare configurations with specialized agent, conversation, and RAG evaluation | SKILL.md |
| generate-synthetic-dataset | Generate and curate evaluation Datasets: structured generation, quick from description, expansion, and dataset maintenance | SKILL.md |
| optimize-prompt | Analyze and optimize system Prompts using a structured prompting guidelines framework | SKILL.md |
| setup-observability | Instrument LLM applications with orq.ai tracing. Covers AI Router (zero-code traces) and OpenTelemetry/OpenInference. Guides from framework detection through baseline verification to trace enrichment | SKILL.md |
| compare-agents | Run cross-framework agent comparisons: compare any combination of orq.ai, LangGraph, CrewAI, or OpenAI Agents SDK agents using evaluatorq | SKILL.md |

Example workflows

Instrument an existing app

"Add orq.ai tracing to my app"                 → setup-observability
/orq:traces --last 1h                           # Verify traces are flowing
"Analyze these failures"                        → analyze-trace-failures

Build a new agent

"I need a customer support agent"              → build-agent
"Create test cases for it"                     → generate-synthetic-dataset
"Build an evaluator for response accuracy"     → build-evaluator
"Run an experiment to get a baseline"          → run-experiment

Debug production issues

/orq:traces --status error --last 24h          # Find errors
"Analyze these failures"                       → analyze-trace-failures
"Fix the prompt based on the failure analysis" → optimize-prompt
"Re-run the experiment to verify the fix"      → run-experiment

Improve an existing agent

/orq:analytics --group-by deployment           # Spot high error rates
"Analyze traces for the checkout agent"        → analyze-trace-failures
"Build evaluators for the failure modes"       → build-evaluator
"Generate a dataset covering edge cases"       → generate-synthetic-dataset
"Run an experiment and compare"                → run-experiment
"Optimize the prompt based on results"         → optimize-prompt

Improve an existing prompt

"My prompt isn't performing well, help me improve it" → optimize-prompt
"Create test cases to compare before and after"       → generate-synthetic-dataset
"Build an evaluator for a specific dimension"         → build-evaluator
"Run an experiment: current vs optimized prompt"      → run-experiment
"Refine the prompt based on failure cases"            → optimize-prompt

Resources

orq-ai/orq-skills

Source repository for all skills, commands, and agents