You can also create an Evaluator using the API; see Creating an Evaluator via the API.
Click the + button and select Evaluator.
The following modal opens:

Select the Evaluator type
- log["input"]- <str>The last message sent to generate the output.
- log["output"]- <str>The generated response from the model.
- log["reference"]- <str>The reference used to compare the output.
- log["messages"]- list<str>All previous messages sent to the model.
- log["retrievals"]- list<str>All Knowledge Base retrievals.
Select the return type for the Evaluator:
- Number to return a score
- Boolean to return a true/false value
The following example compares the output size with the given reference.
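A minimal sketch, assuming the entry-point is a plain Python function that receives the log payload as a dict and returns a number; the function name and scoring rule are illustrative:

```python
# Number Evaluator sketch: score how close the output length is to the
# reference length. The dict-based `log` argument is an assumption.
def length_ratio(log: dict) -> float:
    output = log.get("output") or ""
    reference = log.get("reference") or ""
    if not reference:
        # No reference to compare against: return the lowest score.
        return 0.0
    # Ratio of output length to reference length, capped at 1.0.
    return min(len(output) / len(reference), 1.0)
```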
You can define multiple methods within the code editor; the last method defined will be the entry-point for the Evaluator when it runs, as sketched below.
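For example, a helper plus an entry-point; the function names, the Boolean return type, and the 20% threshold are illustrative assumptions:

```python
# Illustrative helpers plus an entry-point. Only the last function
# defined acts as the entry-point when the Evaluator runs.
def _word_count(text: str) -> int:
    return len(text.split())

def outputs_are_comparable(log: dict) -> bool:
    # Entry-point (last definition): True if the output is within
    # 20% of the reference word count.
    out_words = _word_count(log.get("output") or "")
    ref_words = _word_count(log.get("reference") or "")
    if ref_words == 0:
        return out_words == 0
    return abs(out_words - ref_words) / ref_words <= 0.2
```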
Environment and Libraries
The Python Evaluator runs in the following environment: Python 3.12
The environment comes preloaded with the following libraries:
Testing an Evaluator
Within the Studio, a Playground is available to test an Evaluator against any output. This helps you quickly validate that an Evaluator is behaving correctly. To do so, first configure the request:
Here you can configure the payload that will be sent to a Python evaluator.

A Python test response.
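As a rough mental model of what the Playground does, the request payload maps onto the log fields described above, and the response is whatever the entry-point returns. The field values and function below are hypothetical:

```python
# Hypothetical request payload mirroring the log fields described above.
test_log = {
    "input": "Summarize the quarterly report in one paragraph.",
    "output": "Revenue grew 12% quarter over quarter.",
    "reference": "Revenue grew 12% QoQ, driven by new enterprise deals.",
    "messages": [],
    "retrievals": [],
}

def exact_match(log: dict) -> bool:
    # Illustrative entry-point: does the output match the reference exactly?
    return (log.get("output") or "").strip() == (log.get("reference") or "").strip()

# The test response is the value the entry-point returns for the payload.
print(exact_match(test_log))  # -> False for the payload above
```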
Guardrail Configuration
Within a Deployment, you can use your Python Evaluator as a Guardrail to block potential calls. Enabling the Guardrail toggle will block payloads that do not validate against the given JSON Schema. Once created, the Evaluator will be available to use in Deployments; to learn more, see Evaluators & Guardrails in Deployments.
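As a sketch of the idea only: the required keys below stand in for a JSON Schema, and it is assumed here that a False result from a Boolean Evaluator is what causes the Guardrail to block the payload; the exact wiring is handled by the Deployment.

```python
import json

# Illustrative Boolean Evaluator for use as a Guardrail: returns False
# when the output is not valid JSON containing the expected keys.
REQUIRED_KEYS = {"summary", "sentiment"}  # hypothetical schema

def output_matches_schema(log: dict) -> bool:
    try:
        payload = json.loads(log.get("output") or "")
    except (TypeError, json.JSONDecodeError):
        return False
    return isinstance(payload, dict) and REQUIRED_KEYS.issubset(payload.keys())
```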