Introduction

orq.ai exposes an API for managing Evaluators programmatically. This page covers the common use cases: creating, listing, and invoking Evaluators through the API.

Prerequisite

To get started, you need an API key to use with the SDKs or the HTTP API. To set one up, see Authentication.
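When calling the HTTP API directly, the key is sent as a Bearer token in the authorization header. A minimal TypeScript sketch, assuming the key is exported in the ORQ_API_KEY environment variable:
// Read the API key from the environment; adjust to however you store secrets.
const apiKey = process.env.ORQ_API_KEY;
if (!apiKey) throw new Error("ORQ_API_KEY is not set");

// Every request on this page sends these headers.
const headers = {
  accept: "application/json",
  "content-type": "application/json",
  authorization: `Bearer ${apiKey}`,
};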

Creating an Evaluator

To create an Evaluator, we'll use the Create an Evaluator API call. We first need to decide which type of Evaluator to create:

HTTP Evaluator

Here is a valid payload to create an HTTP evaluator:
{
  "type": "http_eval",
  "method": "POST",
  "headers": {
    "header-key": "header-value"
  },
  "payload": {
    "body-key": "body-value"
  },
  "url": "https://myevaluatorendpoint.com/api",
  "path": "Default/Evaluators",
  "key": "MyEvaluator"
}
To learn more about building HTTP Evaluators, see Creating an HTTP Evaluator.

JSON Evaluator

Here is a valid payload to create a JSON evaluator:
Make sure to correctly escape the JSON Schema payload.
{
  "guardrail_config": {
    "enabled": true,
    "type": "boolean",
    "value": true
  },
  "type": "json_schema",
  "schema": "{   \"$schema\": \"http://json-schema.org/draft-07/schema#\",   \"$id\": \"https://example.com/person.schema.json\",   \"title\": \"Person\",   \"description\": \"A person object\",   \"type\": \"object\",   \"properties\": {     \"firstName\": {       \"type\": \"string\",       \"description\": \"The person's first name\"     },     \"lastName\": {       \"type\": \"string\",       \"description\": \"The person's last name\"     },     \"age\": {       \"type\": \"integer\",       \"minimum\": 0,       \"maximum\": 150,       \"description\": \"Age in years\"     },     \"email\": {       \"type\": \"string\",       \"format\": \"email\",       \"description\": \"Email address\"     },     \"address\": {       \"type\": \"object\",       \"properties\": {         \"street\": {           \"type\": \"string\"         },         \"city\": {           \"type\": \"string\"         },         \"zipCode\": {           \"type\": \"string\",           \"pattern\": \"^[0-9]{5}(-[0-9]{4})?$\"         }       },       \"required\": [\"street\", \"city\", \"zipCode\"]     },     \"hobbies\": {       \"type\": \"array\",       \"items\": {         \"type\": \"string\"       },       \"uniqueItems\": true     }   },   \"required\": [\"firstName\", \"lastName\", \"email\"],   \"additionalProperties\": false }"
}
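Rather than escaping the schema string by hand, you can build the schema as an object and serialize it with JSON.stringify. A TypeScript sketch, using a trimmed-down version of the Person schema above:
// Build the JSON Schema as a plain object, then serialize it into the "schema" string field.
const personSchema = {
  $schema: "http://json-schema.org/draft-07/schema#",
  title: "Person",
  type: "object",
  properties: {
    firstName: { type: "string" },
    lastName: { type: "string" },
  },
  required: ["firstName", "lastName"],
};

const payload = {
  guardrail_config: { enabled: true, type: "boolean", value: true },
  type: "json_schema",
  schema: JSON.stringify(personSchema), // serializing handles the escaping for you
};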
To learn more about building JSON Evaluators, see Creating a JSON Evaluator.

LLM Evaluator

Here is a valid payload to create an LLM evaluator:
{
  "type": "llm_eval",
  "prompt": "Give a number response from 0 to 1, 0 for innapropriate, 10 for perfectly appropriate {{log.output}}",
  "path": "Default/evaluators",
  "model": "openai/gpt-4o",
  "key": "myKey"
}
To learn more about building LLM Evaluators, see Creating an LLM Evaluator.

Python Evaluator

Here is a valid payload to create a Python Evaluator:
Use \n to indicate newlines in the code string.
{
  "type": "python_eval",
  "path": "Default/Evaluators",
  "key": "MyEvaluator",
  "code": "def evaluate(log):\n  output_size = len(log[\"output\"])\n  reference_size = len(log[\"reference\"])\n  return abs(output_size - reference_size)\n"
}
To learn more about building Python Evaluators, see Creating a Python Evaluator.

Guardrail Configuration

For each Evaluator payload, you can also define a guardrail configuration as follows and add it to the creation payload:
{
  "guardrail_config": {
    "enabled": true,
    "type": "number", // can be also boolean
    "value": 5, // value needs to match type
    "operator": "lte" // defines operator to compare value with
  }
}
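As a reference, the guardrail shapes used on this page can be summarized as a TypeScript type. Only the lte and gt operators appear in the examples here; treat the operator list as an assumption and check the API reference for the full set:
// Sketch of the guardrail_config shapes shown on this page; not exhaustive.
type GuardrailConfig =
  | { enabled: boolean; type: "boolean"; value: boolean }
  | {
      enabled: boolean;
      type: "number";
      value: number; // must match the declared type
      operator: "lte" | "gt"; // only the operators seen in these examples
    };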

Calling the API

Here’s an example end-to-end API call and response:
curl --request POST \
     --url https://api.orq.ai/v2/evaluators \
     --header 'accept: application/json' \
     --header 'authorization: Bearer ORQ_API_KEY' \
     --header 'content-type: application/json' \
     --data '
{
  "guardrail_config": {
    "enabled": true,
    "type": "number",
    "value": 5,
    "operator": "lte"
  },
  "type": "python_eval",
  "path": "Default/Evaluators",
  "key": "MyEvaluator",
  "code": "def evaluate(log):\n  output_size = len(log[\"output\"])\n  reference_size = len(log[\"reference\"])\n  return abs(output_size - reference_size)\n"
}
'
The expected response is the following:
{
  "_id":"EVALUATOR_ID",
  "key":"MyEvaluator",
  "description":"",
  "created":"2025-06-26T11:37:02.132Z",
  "updated":"2025-06-26T11:37:02.132Z",
  "type":"python_eval",
  "code":"def evaluate(log):\n  output_size = len(log[\"output\"])\n  reference_size = len(log[\"reference\"])\n  return abs(output_size - reference_size)\n"
}
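The same call as a TypeScript sketch, using fetch with minimal error handling:
// POST the Python Evaluator payload shown above to the create endpoint.
const createResponse = await fetch("https://api.orq.ai/v2/evaluators", {
  method: "POST",
  headers: {
    accept: "application/json",
    "content-type": "application/json",
    authorization: `Bearer ${process.env.ORQ_API_KEY}`,
  },
  body: JSON.stringify({
    guardrail_config: { enabled: true, type: "number", value: 5, operator: "lte" },
    type: "python_eval",
    path: "Default/Evaluators",
    key: "MyEvaluator",
    code: 'def evaluate(log):\n  output_size = len(log["output"])\n  reference_size = len(log["reference"])\n  return abs(output_size - reference_size)\n',
  }),
});

const evaluator = await createResponse.json();
console.log(evaluator._id); // the EVALUATOR_ID from the response above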

Listing Evaluators

To list Evaluators, we use the Listing Evaluators API, making the following call:
curl --request GET \
     --url https://api.orq.ai/v2/evaluators \
     --header 'accept: application/json' \
     --header 'authorization: Bearer ORQ_API_KEY'
The resulting payload is the following:
{
  "object": "list",
  "data": [
    {
      "_id": "EVALUATOR_ID",
      "key": "BERT Score",
      "description": "Computes the similarity of two sentences as a sum of cosine similarities between their tokens embeddings",
      "created": "2024-12-16T12:36:40.359Z",
      "updated": "2024-12-16T12:36:40.359Z",
      "guardrail_config": {
        "enabled": false,
        "type": "number",
        "value": 0.3,
        "operator": "gt"
      },
      "type": "function_eval",
      "function_params": {
        "type": "bert_score"
      }
    },
    ...
  ]
}
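The equivalent TypeScript sketch:
// GET all Evaluators and print their IDs and keys.
const listResponse = await fetch("https://api.orq.ai/v2/evaluators", {
  headers: {
    accept: "application/json",
    authorization: `Bearer ${process.env.ORQ_API_KEY}`,
  },
});
const { data } = await listResponse.json();
for (const evaluator of data) {
  console.log(evaluator._id, evaluator.key);
}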

Using Evaluators

Calling an Evaluator from the Library

We'll call the Tone of Voice endpoint. Here is an example call:
The query defines how the evaluator runs on the given output.
curl --request POST \
     --url https://api.orq.ai/v2/evaluators/tone_of_voice \
     --header 'accept: application/json' \
     --header 'authorization: Bearer <ORQ_API_KEY>' \
     --header 'content-type: application/json' \
     --data '
{
  "query": "Validate the tone of voice if it is professional.",
  "output": "Hello, how are you ??",
  "model": "openai/gpt-4o"
}
'
Here is the result returned by the API. The value field holds the result of the evaluator call for the given query:
{
  "value": {
    "value": false,
    "explanation": "The output does not align with a professional tone. The use of 'Hello, how are you ??' is informal and lacks the formality expected in professional communication. The double question marks and casual greeting are more suited to a casual or friendly context rather than a professional one. A professional tone would require a more formal greeting and a clear purpose for the communication."
  }
}
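In code, the verdict and explanation sit one level down, under value. A TypeScript sketch, assuming the response shape shown above:
// Call the library evaluator, then read the nested result.
const toneResponse = await fetch("https://api.orq.ai/v2/evaluators/tone_of_voice", {
  method: "POST",
  headers: {
    accept: "application/json",
    "content-type": "application/json",
    authorization: `Bearer ${process.env.ORQ_API_KEY}`,
  },
  body: JSON.stringify({
    query: "Validate the tone of voice if it is professional.",
    output: "Hello, how are you ??",
    model: "openai/gpt-4o",
  }),
});

const { value } = await toneResponse.json();
console.log(value.value); // false
console.log(value.explanation); // why the output failed the check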

Calling a custom evaluator

It is also possible to call a custom Evaluator built on orq through the API. Fetch the Evaluator ID by searching for Evaluators with the Get all Evaluators API, then run the following call:
curl 'https://api.orq.ai/v2/evaluators/<evaluator_id>/invoke' \
-H 'Authorization: Bearer <ORQ_API_KEY>' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
--data-raw '{
    "query": "Your input text",
    "output": "Your output text",
    "reference": "Optional reference text",
    "messages": [
        {
            "role": "user",
            "content": "Your message"
        }
    ],
    "retrievals": ["Your retrieval content"]
}' \
--compressed
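The same invoke call as a TypeScript sketch; include only the fields your Evaluator actually reads:
// Invoke a custom Evaluator by ID (fetched via the Get all Evaluators API).
const evaluatorId = "<evaluator_id>";
const invokeResponse = await fetch(
  `https://api.orq.ai/v2/evaluators/${evaluatorId}/invoke`,
  {
    method: "POST",
    headers: {
      accept: "application/json",
      "content-type": "application/json",
      authorization: `Bearer ${process.env.ORQ_API_KEY}`,
    },
    body: JSON.stringify({
      query: "Your input text",
      output: "Your output text",
      reference: "Optional reference text",
    }),
  },
);
console.log(await invokeResponse.json());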
Finally, through the Orq studio, find the View Code button on your Evaluator page. The following modal opens:

Evaluator code is available directly to use.

This code calls the current Evaluator through the API. Ensure the payload contains all the data the Evaluator needs to execute correctly.

Using EvaluatorQ

EvaluatorQ is a dedicated SDK for using Evaluators within your application. It features the following capabilities:
  • Parallel Execution: Run multiple evaluation jobs concurrently with progress tracking
  • Flexible Data Sources: Support for inline data, promises, and Orq platform datasets
  • Type-safe: Fully written in TypeScript
Installation:
npm install @orq-ai/evaluatorq
Usage example:
import { evaluatorq, job } from "@orq-ai/evaluatorq";

const textAnalyzer = job("text-analyzer", async (data) => {
  const text = data.inputs.text;
  const analysis = {
    length: text.length,
    wordCount: text.split(" ").length,
    uppercase: text.toUpperCase(),
  };

  return analysis;
});

await evaluatorq("text-analysis", {
  data: [
    { inputs: { text: "Hello world" } },
    { inputs: { text: "Testing evaluation" } },
  ],
  jobs: [textAnalyzer],
  evaluators: [
    {
      name: "length-check",
      scorer: async ({ output }) => {
        const passesCheck = output.length > 10;
        return {
          value: passesCheck ? 1 : 0,
          explanation: passesCheck
            ? "Output length is sufficient"
            : `Output too short (${output.length} chars, need >10)`,
        };
      },
    },
  ],
});
To learn more, see the Python EvaluatorQ and TypeScript EvaluatorQ repositories.