Orq.ai Documentation - AI Gateway & LLM Collaboration Platform

Datasets hold the test data that powers Experiments. Each dataset row contains up to three fields:

Inputs: Variables injected into the prompt at runtime, e.g. {{firstname}}.
Messages: The prompt template, structured with system, user, and assistant roles.
Expected Outputs: Reference responses evaluators compare against model outputs.

You don’t need all three fields in every dataset. A dataset with only inputs, or only messages, is valid.
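As a rough illustration of the three fields, the following Python sketch builds rows that use different subsets of them and shows how a `{{firstname}}` placeholder could be filled from the inputs. The field names follow the API examples later on this page. The `render` helper is a hypothetical local stand-in, since Orq performs the actual substitution at runtime.

```python
import re

# Example rows; each uses a different subset of the three fields.
inputs_only = {"inputs": {"firstname": "Ada"}}
messages_only = {
    "messages": [{"role": "user", "content": "Say hello to {{firstname}}."}]
}
full_row = {
    "inputs": {"firstname": "Ada"},
    "messages": [{"role": "user", "content": "Say hello to {{firstname}}."}],
    "expected_output": "Hello, Ada!",
}


def render(template: str, variables: dict) -> str:
    """Replace each {{name}} with its value; leave unknown names untouched.

    Hypothetical helper for illustration only, not part of any Orq SDK.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(variables.get(name, match.group(0)))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


rendered = render(full_row["messages"][0]["content"], full_row["inputs"])
print(rendered)  # Say hello to Ada.
```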

Use Cases

Regression and pre-deployment testing

Run the same dataset through your prompts before and after a change to verify that updates haven’t degraded performance in any area.

Compare models and prompt variants

Use the same dataset across multiple models or prompt configurations in an Experiment to find the best combination of quality, cost, and latency.

Curated datasets for fine-tuning

Have domain experts review and correct model outputs, then save those verified input/output pairs as a curated dataset to use as fine-tuning reference data.

Synthetic data generation at scale

Use the Orq MCP to generate hundreds of realistic test cases programmatically and add them directly to a dataset without leaving your IDE.

Vision and image datasets

Build datasets with image messages for testing vision models. JPEG, PNG, GIF, and WebP are supported via the AI Studio or the API.

Create a Dataset

AI Studio
API & SDK
MCP

Use the button on a Project folder and select Dataset. Enter a title to open the Table View. The table has three columns: Inputs, Messages, and Expected Outputs. Add as many rows as needed.

Use the Create Dataset API. A unique display_name and a path (the Project folder) are required.

curl --request POST \
     --url https://api.orq.ai/v2/datasets \
     --header 'accept: application/json' \
     --header 'authorization: Bearer ORQ_API_KEY' \
     --header 'content-type: application/json' \
     --data '{
  "display_name": "MyDataset",
  "path": "Default"
}'

The response includes a _id field (the dataset ID) used in subsequent calls.

See the full Create Dataset API reference.
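The same call can be sketched in Python using only the standard library. This is a minimal sketch that mirrors the curl example above: the URL, headers, and body fields come from that example, while the helper name and the `ORQ_SEND_REQUEST` switch are hypothetical and exist only so the snippet can run without network access.

```python
import json
import os
import urllib.request

API_URL = "https://api.orq.ai/v2/datasets"


def build_create_dataset_request(
    api_key: str, display_name: str, path: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"display_name": display_name, "path": path}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "accept": "application/json",
            "authorization": f"Bearer {api_key}",
            "content-type": "application/json",
        },
    )


request = build_create_dataset_request(
    os.getenv("ORQ_API_KEY", ""), "MyDataset", "Default"
)

# Sending is opt-in. ORQ_SEND_REQUEST is a hypothetical switch, not an Orq setting.
if os.getenv("ORQ_SEND_REQUEST") == "1":
    with urllib.request.urlopen(request) as response:
        dataset = json.load(response)
    print(dataset["_id"])  # dataset ID for subsequent calls
```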

Create a dataset:

Create a dataset called "Support Training Data" in the Default project

The assistant uses create_dataset with the display name and path.

Generate a synthetic dataset:

Generate 50 realistic customer support questions about a SaaS product and create a dataset called "Support Training Data"

The assistant generates the entries, uses create_dataset to create the dataset, then uses create_datapoints to add all entries in bulk.

Find an existing dataset:

Find the "user-queries" dataset in my workspace

The assistant uses search_entities with type: "dataset" to locate it by name.

Add Datapoints

AI Studio
API & SDK
MCP

Manually: Click Add Row and fill in each cell.
From CSV: Click Import and drag-and-drop a .csv file. Map each CSV column to a Dataset field (Inputs, Messages, Expected Outputs). Each row becomes a separate datapoint.
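For scripted imports, the same column-to-field mapping can be approximated in Python before sending the rows through the API. The sketch below is illustrative only: the CSV column names (`country`, `question`, `answer`) are made up, and the output shape follows the datapoint structure used in the API examples on this page.

```python
import csv
import io

# Hypothetical CSV content; the column names are invented for this example.
csv_text = """country,question,answer
France,Capital of {{country}}?,Paris
Germany,Capital of {{country}}?,Berlin
"""


def rows_to_datapoints(text: str) -> list[dict]:
    """Map each CSV row to one datapoint with inputs, messages, expected_output."""
    datapoints = []
    for row in csv.DictReader(io.StringIO(text)):
        datapoints.append({
            "inputs": {"country": row["country"]},
            "messages": [{"role": "user", "content": row["question"]}],
            "expected_output": row["answer"],
        })
    return datapoints


datapoints = rows_to_datapoints(csv_text)
print(len(datapoints))  # 2
```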

Use the Create Datapoints API. Send between 1 and 5,000 datapoints per request. Requests over 500 datapoints are automatically chunked.

curl --request POST \
     --url https://api.orq.ai/v2/datasets/DATASET_ID/datapoints \
     --header 'accept: application/json' \
     --header 'authorization: Bearer ORQ_API_KEY' \
     --header 'content-type: application/json' \
     --data '[
  {
    "inputs": {"country": "France"},
    "messages": [
      {"role": "user", "content": "Capital of {{country}}?"},
      {"role": "assistant", "content": "Paris"}
    ],
    "expected_output": "Paris"
  },
  {
    "inputs": {"country": "Germany"},
    "messages": [
      {"role": "user", "content": "Capital of {{country}}?"},
      {"role": "assistant", "content": "Berlin"}
    ],
    "expected_output": "Berlin"
  }
]'
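Since a single request accepts between 1 and 5,000 datapoints, a larger collection has to be split on the client before it is sent. The following sketch shows one way to do that. The `batched` helper is a hypothetical utility, and each resulting batch would be posted with a separate call to the endpoint above.

```python
from typing import Iterator

MAX_PER_REQUEST = 5000  # upper bound per request stated above


def batched(items: list, size: int = MAX_PER_REQUEST) -> Iterator[list]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 12,000 placeholder datapoints split into request-sized batches.
datapoints = [{"inputs": {"n": i}} for i in range(12_000)]
sizes = [len(batch) for batch in batched(datapoints)]
print(sizes)  # [5000, 5000, 2000]
```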

Large batch example (1,000 datapoints)

Python

from orq_ai_sdk import Orq
import os

with Orq(api_key=os.getenv("ORQ_API_KEY", "")) as orq:
    datapoints = []
    for i in range(1000):
        datapoints.append({
            "inputs": {"number": i, "operation": "square"},
            "messages": [
                {"role": "user", "content": f"What is {i} squared?"},
                {"role": "assistant", "content": f"{i} squared is {i**2}"}
            ],
            "expected_output": str(i**2)
        })

    res = orq.datasets.create_datapoint(
        dataset_id="DATASET_ID",
        request_body=datapoints
    )
    print(f"Created {len(res)} datapoints")

See the full Create Datapoints API reference.

Add datapoints to an existing dataset:

Import this JSON array as datapoints into the "Support Training Data" dataset

The assistant uses create_datapoints to add entries in batches (max 100 per call).

Update a datapoint:

Update the expected output of datapoint ID "abc123" in the "user-queries" dataset to "New expected answer"

The assistant uses update_datapoint with the datapoint ID and updated fields.

Clean up a dataset:

Delete all datapoints in the "staging-tests" dataset that have an empty expected_output field

The assistant uses list_datapoints to retrieve all entries, filters for empty expected_output, then uses delete_datapoints to remove them in batches.
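The filtering step in the clean-up flow can be sketched locally in Python. This is only an approximation of what the assistant does: it assumes each datapoint carries an `_id` field (by analogy with the dataset response), treats missing, null, and whitespace-only values as empty, and uses an arbitrary batch size of 2 for the demonstration.

```python
def is_empty(value) -> bool:
    """Treat a missing value, None, or whitespace-only text as empty."""
    return value is None or str(value).strip() == ""


def ids_to_delete(datapoints: list[dict]) -> list[str]:
    """Collect the IDs of datapoints whose expected_output is empty.

    Assumes an `_id` key on each datapoint; that key name is an assumption.
    """
    return [dp["_id"] for dp in datapoints if is_empty(dp.get("expected_output"))]


def batched(ids: list[str], size: int) -> list[list[str]]:
    """Split the IDs into groups of at most `size` for batched deletion."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


sample = [
    {"_id": "dp_1", "expected_output": "Paris"},
    {"_id": "dp_2", "expected_output": ""},
    {"_id": "dp_3"},
    {"_id": "dp_4", "expected_output": "   "},
]
stale = ids_to_delete(sample)
print(stale)              # ['dp_2', 'dp_3', 'dp_4']
print(batched(stale, 2))  # [['dp_2', 'dp_3'], ['dp_4']]
```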

Create Image Datasets

AI Studio
API & SDK

Start by creating a dataset, then add messages with images. When editing a message cell:

Click Add image in the message editor.

Choose how to provide the image:
- Upload locally: Select a file from your computer.
- Enter URL: Paste an image URL directly.

URL input field for entering an image URL, with a Select image button to browse locally.

Animated walkthrough of adding an image to a dataset message by URL or local file upload.

Supported formats: JPEG, PNG, GIF, WebP.

Images must be encoded as base64 data URLs before adding to the dataset.

Create a Dataset

curl -X POST https://api.orq.ai/v2/datasets \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "display_name": "Image Analysis Dataset",
    "path": "Default"
  }'

Convert Images to Base64

import base64

def image_to_base64(image_path):
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    ext = image_path.lower().split(".")[-1]
    mime_types = {"png": "image/png", "gif": "image/gif", "webp": "image/webp",
                  "jpg": "image/jpeg", "jpeg": "image/jpeg"}
    mime_type = mime_types.get(ext, "image/jpeg")
    return f"data:{mime_type};base64,{encoded}"

Add Images to the Dataset

DATASET_ID="DATASET_ID"
IMAGE_DATA=$(base64 < "/path/to/image.jpg" | tr -d '\n')

curl -X POST "https://api.orq.ai/v2/datasets/$DATASET_ID/datapoints" \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe what you see in this image"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,'$IMAGE_DATA'", "detail": "auto"}}
      ]
    }]
  }'

Detail parameter:

Value	Behaviour
`auto` (recommended)	Automatically optimises based on image size
`low`	Faster processing, lower token usage
`high`	More detailed analysis, higher token usage
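The request body from the shell example can also be assembled in Python, which avoids shell quoting around the base64 string. This is a sketch under the assumption that the payload shape matches the curl example above. The `image_datapoint` helper is hypothetical, and the sample bytes are just the PNG file signature standing in for a real image.

```python
import base64
import json

ALLOWED_DETAIL = {"auto", "low", "high"}  # values from the Detail parameter table


def image_datapoint(
    image_bytes: bytes, mime_type: str, prompt: str, detail: str = "auto"
) -> dict:
    """Build one datapoint with a text part and a base64 data-URL image part."""
    if detail not in ALLOWED_DETAIL:
        raise ValueError(f"detail must be one of {sorted(ALLOWED_DETAIL)}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:{mime_type};base64,{encoded}",
                        "detail": detail,
                    },
                },
            ],
        }]
    }


sample = b"\x89PNG\r\n\x1a\n"  # PNG signature only, not a complete image
payload = image_datapoint(sample, "image/png", "Describe what you see in this image")
body = json.dumps(payload)  # JSON string to send as the request body
```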

Supported formats:

Format	MIME Type	Extension
JPEG	`image/jpeg`	`.jpg`, `.jpeg`
PNG	`image/png`	`.png`
GIF	`image/gif`	`.gif`
WebP	`image/webp`	`.webp`

Common errors:

Error	Cause	Solution
Invalid API key	Authentication failed	Check your API key in the workspace settings
File not found	Image path is incorrect	Verify the path and file permissions
Unsupported format	Format not supported	Convert to JPEG, PNG, GIF, or WebP
Payload too large	Image file is too large	Compress or resize before upload
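Two of the errors above, unsupported format and payload too large, can be caught before any request is made. The sketch below shows one possible pre-flight check. The extension list comes from the supported formats table, but the size limit is a hypothetical parameter, since this page does not state the actual maximum.

```python
from pathlib import Path

SUPPORTED_MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def preflight(filename: str, size_bytes: int, max_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the image looks acceptable."""
    problems = []
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_MIME_TYPES:
        problems.append(f"Unsupported format: {suffix or 'no extension'}")
    if size_bytes > max_bytes:
        problems.append(f"Payload too large: {size_bytes} bytes exceeds {max_bytes}")
    return problems


LIMIT = 5 * 1024 * 1024  # hypothetical 5 MiB budget, not a documented Orq limit
print(preflight("chart.PNG", 120_000, LIMIT))   # []
print(preflight("scan.bmp", 9_000_000, LIMIT))  # two problems reported
```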

Create Curated Datasets

Curated datasets are human-evaluated input and output sets: a prompt paired with a verified expected output. They are used for fine-tuning and as a gold-standard reference in Experiments.

Within any module, open the Logs tab and select a log entry. The Feedback panel appears on the right. To add a correction, click Add correction below the assistant response:

Assistant response box with an Add correction button highlighted in red below it.

Edit the response in the Correction message that opens, then click Save.

Original assistant response shown in purple above a Correction box in green, with the corrected text entered and a Save button.

Click the Add to Dataset icon at the top-right of the response to save the corrected entry to a dataset:

Log view showing an Add to dataset dropdown with curated dataset selected, a Replace Inputs toggle, and an Add to dataset button.

Import a curated dataset into an Experiment, attach an Evaluator, and see which model or prompt scores best against the curated reference outputs.
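To make the idea of scoring against curated reference outputs concrete, here is a small Python sketch of an exact-match comparison. It is a simplified stand-in, not one of Orq's Evaluators: the normalization rule, the function names, and the sample outputs are all assumptions chosen for illustration.

```python
def normalize(text: str) -> str:
    """Collapse whitespace and ignore case so trivial differences do not count."""
    return " ".join(text.split()).casefold()


def exact_match_score(outputs: list[str], references: list[str]) -> float:
    """Fraction of outputs that match their reference after normalization."""
    if len(outputs) != len(references):
        raise ValueError("outputs and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(o) == normalize(r) for o, r in zip(outputs, references))
    return hits / len(references)


# Curated reference outputs and made-up outputs from two prompt variants.
references = ["Paris", "Berlin"]
candidates = {
    "prompt_a": ["paris", "Berlin "],
    "prompt_b": ["Paris", "Bonn"],
}
scores = {name: exact_match_score(outs, references) for name, outs in candidates.items()}
best = max(scores, key=scores.get)
print(scores)  # {'prompt_a': 1.0, 'prompt_b': 0.5}
print(best)    # prompt_a
```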

List and Retrieve Datasets

API & SDK
MCP

List datasets:

curl --request GET \
     --url https://api.orq.ai/v2/datasets \
     --header 'accept: application/json' \
     --header 'authorization: Bearer ORQ_API_KEY'

Retrieve a dataset by ID:

curl --request GET \
     --url https://api.orq.ai/v2/datasets/DATASET_ID \
     --header 'accept: application/json' \
     --header 'authorization: Bearer ORQ_API_KEY'

See the List Datasets and Retrieve a Dataset API references.
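Both GET calls can be prepared in Python with the standard library. This minimal sketch reuses the URLs and headers from the curl examples above. The `build_get_request` helper is hypothetical, and the requests are only constructed here, not sent.

```python
import os
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://api.orq.ai/v2/datasets"


def build_get_request(
    api_key: str, dataset_id: Optional[str] = None
) -> urllib.request.Request:
    """Build a list request, or a retrieve request when dataset_id is given."""
    url = BASE_URL
    if dataset_id is not None:
        # Percent-encode the ID so it is safe as a single path segment.
        url = f"{BASE_URL}/{urllib.parse.quote(dataset_id, safe='')}"
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "accept": "application/json",
            "authorization": f"Bearer {api_key}",
        },
    )


api_key = os.getenv("ORQ_API_KEY", "")
list_request = build_get_request(api_key)
retrieve_request = build_get_request(api_key, "DATASET_ID")
print(list_request.full_url)
print(retrieve_request.full_url)
```

Passing either request to `urllib.request.urlopen` would perform the call.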

List all datasets:

List all datasets in my workspace

The assistant uses search_entities with type: "dataset".

Find a specific dataset:

Find the "user-queries" dataset and show me its datapoints

The assistant uses search_entities to locate the dataset, then list_datapoints to retrieve its entries.
