Overview

Visual Studio Code supports MCP servers through the GitHub Copilot extension. The Orq MCP integration lets you access your Orq.ai workspace directly from Copilot Chat and the editor.

Prerequisites

  • Visual Studio Code with the GitHub Copilot extension installed and signed in
  • An Orq.ai account with an API key (created under Workspace Settings → API Keys)

Installation

Add MCP Server

  1. Open the Command Palette with ⌘⇧P (macOS) or Ctrl+Shift+P (Windows/Linux)
  2. Search for and select MCP: Add Server
  3. Select HTTP (HTTP or Server-Sent Events) as the server type
  4. Enter the server URL: https://my.orq.ai/v2/mcp
  5. Name the server orq when prompted
VS Code will create or update a .vscode/mcp.json file in the workspace root. Open it and replace the contents with:
{
  "inputs": [
    {
      "type": "promptString",
      "id": "orq-api-key",
      "description": "Orq.ai API Key",
      "password": true
    }
  ],
  "servers": {
    "orq": {
      "url": "https://my.orq.ai/v2/mcp",
      "type": "http",
      "headers": {
        "Authorization": "Bearer ${input:orq-api-key}"
      }
    }
  }
}
When connecting to the server for the first time, VS Code will prompt for the Orq.ai API key and store it in the OS secret store; the key is never written to the mcp.json file.
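
To sanity-check the endpoint outside VS Code, you can hand-build the JSON-RPC initialize request that starts the MCP handshake. This is a minimal sketch, assuming the server follows the standard MCP Streamable HTTP transport; the clientInfo name and protocol version string are illustrative assumptions, not values from this page.

```python
import json

MCP_URL = "https://my.orq.ai/v2/mcp"

def build_initialize_request(api_key: str):
    """Build headers and JSON-RPC payload for an MCP initialize call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Streamable HTTP servers may answer as plain JSON or as an SSE stream
        "Accept": "application/json, text/event-stream",
    }
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumed MCP protocol revision
            "capabilities": {},
            "clientInfo": {"name": "orq-mcp-check", "version": "0.1"},
        },
    }
    return headers, payload

headers, payload = build_initialize_request("orq-api-key-placeholder")
print(json.dumps(payload)["" == "" and 0:40])
```

Send it with any HTTP client (for example requests.post(MCP_URL, headers=headers, json=payload)); a 200 response with a result object suggests the URL and key are working.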

Verify Installation

  1. Open Copilot Chat with ⌃⌘I (macOS) or Ctrl+Alt+I (Windows/Linux)
  2. Ask:
Can you list the available models from Orq?
If the integration is working, a list of AI models from the Orq.ai workspace will appear. To check server status at any time, open the Command Palette and run MCP: List Servers.

Available Commands

Use natural language to ask Copilot to perform these operations:
  • create an agent with custom instructions and tools
  • get agent configuration for [agent-key]
  • update agent [agent-key] with new instructions or model
  • configure agent with evaluators and guardrails
  • get analytics overview for my workspace
  • show me workspace metrics for the last 7 days
  • query analytics filtered by deployment ID
  • create a dataset called "customer-queries"
  • list all datapoints in dataset [dataset-key]
  • add datapoints to dataset [dataset-key]
  • update datapoint [datapoint-id]
  • delete specific datapoints in dataset [dataset-key]
  • delete dataset [dataset-key]
  • create an experiment from dataset [dataset-key]
  • list all experiment runs
  • export experiment run [run-id] as CSV
  • run experiment and auto-evaluate results
  • get evaluator configuration for [evaluator-key]
  • create an LLM-as-a-Judge evaluator for tone
  • create a Python evaluator to check response length
  • add evaluator to experiment [experiment-key]
  • update evaluator [evaluator-key] with a new prompt
  • update Python evaluator [evaluator-key] with revised code
  • list traces from the last 24 hours
  • show me traces with errors
  • get span details for trace [trace-id]
  • find the slowest traces from today
  • show all traces for thread [thread-id]
  • list all available chat models
  • list all available embedding models
  • list registry keys for filtering traces
  • list top values for [attribute-key]
  • search the Orq.ai docs for [topic]
  • delete agent [agent-key]
  • delete experiment [experiment-key]
  • delete evaluator [evaluator-key]
  • delete prompt [prompt-key]
  • delete knowledge base [knowledge-base-key]
Use delete_dataset to delete a dataset along with all its datapoints.

Usage Examples

Chat Panel Commands

Use natural language in Copilot Chat:
create a dataset called "API Tests" with 20 synthetic API request examples
The assistant will:
  1. Generate 20 synthetic API request examples
  2. Use create_dataset to create a new dataset named “API Tests”
  3. Use create_datapoints to add all examples to the dataset
  4. Confirm creation with the dataset ID
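
Step 1 can be sketched in plain Python. The input/expected_output field names mirror the datapoint shape used elsewhere on this page; the generator itself is a hypothetical illustration, not what the assistant actually runs.

```python
def generate_api_examples(n: int = 20):
    """Generate n synthetic API request examples as datapoint dicts."""
    methods = ["GET", "POST", "PUT", "DELETE"]
    resources = ["users", "orders", "products", "invoices", "sessions"]
    examples = []
    for i in range(n):
        method = methods[i % len(methods)]
        resource = resources[i % len(resources)]
        examples.append({
            "input": f"{method} /api/v1/{resource}/{i}",
            "expected_output": f"200 OK: {resource} record {i} handled via {method}",
        })
    return examples

examples = generate_api_examples(20)
print(len(examples))  # → 20
```
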
show me errors from the last 24 hours
The assistant will:
  1. Calculate the time range for the last 24 hours
  2. Use list_traces with error status filter
  3. Display trace IDs, error messages, and timestamps
  4. Provide a summary of error types and frequency
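
Step 1 (computing the 24-hour window) is ordinary datetime arithmetic. The ISO-8601 strings below are what a trace filter would typically accept; the exact parameter names the list_traces tool expects are an assumption.

```python
from datetime import datetime, timedelta, timezone

def last_24h_range():
    """Return (start, end) ISO-8601 timestamps covering the last 24 hours."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=24)
    return start.isoformat(), end.isoformat()

start, end = last_24h_range()
print(start < end)  # → True
```
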
create an experiment comparing GPT-5.2 and Claude Sonnet 4.6 using the "user-queries" dataset
The assistant will:
  1. Search for the “user-queries” dataset using search_entities
  2. Use create_experiment with two configurations (one for GPT-5.2, one for Claude Sonnet 4.6)
  3. Run the experiment against all datapoints in the dataset
  4. Display the experiment ID and status

Inline Code Integration

Copilot can use Orq MCP context while coding:
  1. Select code in the editor
  2. Open Copilot Chat (⌃⌘I / Ctrl+Alt+I)
  3. Ask about traces or analytics related to the code
Example:
# Select this function
def process_user_query(query):
    response = orq.deployments.invoke(...)
    return response
Then ask:
how has this endpoint performed over the last week?
The assistant will:
  1. Extract the deployment key from the selected code
  2. Use query_analytics with the deployment filter
  3. Set time range to the last 7 days
  4. Analyze performance metrics (requests, errors, latency, tokens)
  5. Provide insights and recommendations based on the data
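
Step 1 is essentially pattern matching on the selected source. A hypothetical sketch, assuming the deployment key is passed as a key= argument to invoke (the snippet above elides the real arguments):

```python
import re

def extract_deployment_key(source: str):
    """Pull a deployment key out of an orq.deployments.invoke(...) call, if present."""
    match = re.search(r'invoke\(\s*key\s*=\s*["\']([^"\']+)["\']', source)
    return match.group(1) if match else None

# "user-query-handler" is a made-up key for illustration
code = 'response = orq.deployments.invoke(key="user-query-handler")'
print(extract_deployment_key(code))  # → user-query-handler
```
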

Dataset Creation from Code

[
  {"input": "What is AI?", "expected_output": "Artificial Intelligence..."},
  {"input": "Explain ML", "expected_output": "Machine Learning..."}
]
create a dataset from the JSON array above and add it to my workspace
The assistant will:
  1. Parse the JSON array from the editor
  2. Use create_dataset to create a new dataset with an auto-generated name
  3. Use create_datapoints to add each entry as a datapoint
  4. Confirm the dataset ID and number of datapoints added
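
Steps 1–3 amount to parsing the array and reshaping each entry. A sketch using the same two-entry array shown above, assuming datapoints carry the input/expected_output fields:

```python
import json

raw = '''[
  {"input": "What is AI?", "expected_output": "Artificial Intelligence..."},
  {"input": "Explain ML", "expected_output": "Machine Learning..."}
]'''

# One datapoint per array entry, keeping only the fields the dataset expects
datapoints = [
    {"input": item["input"], "expected_output": item["expected_output"]}
    for item in json.loads(raw)
]
print(len(datapoints))  # → 2
```
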

Experiment Analysis

create an experiment using "customer-feedback" dataset, configure it with two prompts: one focused on empathy and one focused on brevity, then run it and summarize the results
The assistant will:
  1. Search for the “customer-feedback” dataset using search_entities
  2. Use create_experiment with two prompt variants (empathy-focused and brevity-focused) and auto-run enabled
  3. Execute both variants against all datapoints automatically via the auto-run option
  4. Use get_experiment_run to retrieve evaluation metrics
  5. Compare the two variants and provide a summary of which performed better
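
Step 5's comparison reduces to aggregating evaluator scores per variant. A sketch over hypothetical run results; the field names are illustrative, not the actual get_experiment_run schema.

```python
def summarize_variants(rows):
    """Average evaluator scores per variant and rank variants best-first."""
    totals = {}
    for row in rows:
        bucket = totals.setdefault(row["variant"], [0.0, 0])
        bucket[0] += row["score"]
        bucket[1] += 1
    means = {v: s / n for v, (s, n) in totals.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)

# Fabricated example scores for the two prompt variants
rows = [
    {"variant": "empathy", "score": 0.82},
    {"variant": "empathy", "score": 0.78},
    {"variant": "brevity", "score": 0.91},
    {"variant": "brevity", "score": 0.85},
]
ranking = summarize_variants(rows)
print(ranking[0][0])  # → brevity
```
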

Performance Investigation

find the 5 slowest traces from today and show me their span details
The assistant will:
  1. Use list_traces with today’s date filter
  2. Sort traces by duration (descending)
  3. Retrieve the top 5 slowest traces
  4. Use list_spans to fetch span information for each trace
  5. Display latency breakdowns, bottlenecks, and performance insights
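
Steps 2–3 are a sort-and-slice. A sketch over hypothetical trace records with a duration_ms field (the real trace schema may differ):

```python
def slowest_traces(traces, n=5):
    """Return the n traces with the longest duration, slowest first."""
    return sorted(traces, key=lambda t: t["duration_ms"], reverse=True)[:n]

traces = [{"trace_id": f"t{i}", "duration_ms": d}
          for i, d in enumerate([120, 4500, 300, 2750, 90, 6100, 800])]
top = slowest_traces(traces, n=5)
print([t["trace_id"] for t in top])  # → ['t5', 't1', 't3', 't6', 't2']
```
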

Synthetic Data Generation

generate 50 realistic customer support questions about a SaaS product and create a dataset called "Support Training Data"
The assistant will:
  1. Generate 50 synthetic customer support questions and expected responses
  2. Use create_dataset to create a dataset named “Support Training Data”
  3. Use create_datapoints to add all 50 examples to the dataset
  4. Confirm creation with the dataset ID and sample of generated questions
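
The generation step can be sketched by combining question templates with SaaS topics; the templates and topics here are illustrative, not what the assistant actually produces.

```python
import itertools

def generate_support_questions(n=50):
    """Generate n synthetic SaaS customer-support datapoints."""
    templates = [
        "How do I {verb} my {noun}?",
        "Why can't I {verb} a {noun}?",
        "What happens to my {noun} when I {verb} it?",
        "Is there a limit on how often I can {verb} a {noun}?",
        "Who on my team is allowed to {verb} a {noun}?",
    ]
    verbs = ["upgrade", "cancel", "export", "share", "restore"]
    nouns = ["subscription", "workspace", "report", "dashboard", "invoice"]
    # 5 x 5 x 5 = 125 combinations; take the first n
    combos = itertools.product(templates, verbs, nouns)
    questions = []
    for template, verb, noun in itertools.islice(combos, n):
        q = template.format(verb=verb, noun=noun)
        questions.append({"input": q, "expected_output": f"Guidance on: {q}"})
    return questions

data = generate_support_questions(50)
print(len(data))  # → 50
```
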

Troubleshooting

Server connection issues

  1. Open the Command Palette and run MCP: List Servers
  2. Select the orq server and choose Show Output to view logs
  3. Restart VS Code and reconnect. VS Code will prompt for the API key again on first connection

Authentication errors

  1. Confirm the API key is valid in Workspace Settings → API Keys
  2. Ensure the API key has the necessary permissions
  3. Try regenerating the API key, then restart VS Code so the input prompt appears again

Copilot does not use Orq tools

  1. Run MCP: List Servers from the Command Palette and confirm the orq server status is active
  2. Ensure GitHub Copilot is signed in and active
  3. Check network connectivity
  4. Review the server output log for error details