
Create a new Evaluator.
HTTP Evaluator
HTTP evaluators allow users to set up a custom evaluation by calling an external API, enabling flexible and tailored assessments of generated responses. This approach lets users leverage their own or third-party APIs to perform specific checks, such as custom quality scoring, compliance verification, or domain-specific validations that align with their unique requirements. When creating an HTTP Evaluator, define the following details of your Evaluator:

| Field | Description |
|---|---|
| URL | The API Endpoint. |
| Headers | Key-value pairs for HTTP Headers sent during evaluation. |
| Payload | Key-value pairs for HTTP Body sent during evaluation. |
Payload Detail
Here are the payload variables accessible to perform an evaluation.
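As a sketch, assuming the log fields documented later on this page and the same {{log.*}} templating used in LLM Evaluator prompts, a Payload configuration could look like the following (the keys are illustrative, not a required shape):

```json
{
  "input": "{{log.input}}",
  "output": "{{log.output}}",
  "reference": "{{log.reference}}"
}
```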
Expected Response Payload
For an HTTP Evaluator to be valid, orq.ai expects a specific response payload returning the evaluation result.
Boolean Response
You can decide to return a Boolean response to the evaluation; the following is the expected payload:
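A hypothetical Boolean payload might look like this; the field name `value` is illustrative rather than a confirmed part of the contract:

```json
{
  "value": true
}
```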
Number Response
You can decide to return a Number response to the evaluation; the following is the expected payload:
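Similarly, a hypothetical Number payload (field name illustrative):

```json
{
  "value": 0.87
}
```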
String Response
You can decide to return a String response to the evaluation; the following is the expected payload:
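And a hypothetical String payload (field name illustrative):

```json
{
  "value": "compliant"
}
```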
Example for HTTP Evaluators
HTTP Evaluators can be useful for implementing business- or industry-specific checks from within your applications. For instance, you can build an Evaluator using an API on your own systems that performs a compliance check. This HTTP Evaluator has agency over calls routed through orq.ai while keeping business intelligence and logic within your environments, ensuring that generated content adheres to your organization's specific regulatory guidelines. If the content does not adhere to those guidelines, the HTTP call could return the following, failing the evaluator along the way.
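As a sketch, reusing the hypothetical Boolean payload shape shown above:

```json
{
  "value": false
}
```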
Guardrail Configuration
Within a Deployment, you can use your HTTP Evaluator as a Guardrail, effectively preventing the deployment from responding to a user depending on the input or output. Here you can define the guardrail condition:
- If the HTTP Evaluator returns a value higher than the defined value, the call is accepted.
- If the HTTP Evaluator returns a value lower than or equal to the defined value, the call is denied.
Testing an HTTP Evaluator
Within the Studio, a Playground is available to test an evaluator against any output. This helps you quickly validate that an evaluator is behaving correctly. To do so, first configure the request:
Here you can configure all fields that will be sent to an evaluator.

An HTTP test response.
JSON Evaluator
JSON Evaluators allow users to validate JSON payloads against JSON Schemas, ensuring correct incoming or outgoing payloads for your model. When creating an evaluator, you can specify the JSON Schema that will be used. A JSON Schema lets you define which fields you expect to find in the evaluated payload. Here is an example defining two mandatory fields, title and length:
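A minimal sketch of such a schema; the field types (string and number) are assumptions for illustration:

```json
{
  "type": "object",
  "properties": {
    "title": { "type": "string" },
    "length": { "type": "number" }
  },
  "required": ["title", "length"]
}
```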
Testing a JSON Evaluator
Within the Studio, a Playground is available to test an evaluator against any output. This helps you quickly validate that an evaluator is behaving correctly. To do so, first configure the request:
Here you can configure the JSON payload that will be sent to an evaluator.

A JSON test response.
Guardrail Configuration
Within a Deployment, you can use your JSON Evaluator as a Guardrail, effectively permitting a JSON validation on the input and output of a deployment generation. Enabling the Guardrail toggle will block payloads that don't validate against the given JSON Schema. Once created, the Evaluator will be available to use in Deployments; to learn more, see Evaluators & Guardrails in Deployments.
LLM Evaluator
Unlike Function Evaluators, LLM Evaluators assess the context and provide human-like judgments on the quality or appropriateness of content. When creating your LLM Evaluator, select the model you would like to use to evaluate the output (the model needs to be enabled in your Model Garden). Choose which type of output your model evaluation will provide:
- Boolean, if the evaluation generates a True/False response.
- Number, if the evaluation generates a Score.
Configure Prompt
Your prompt has access to the following string variables:

| Variable | Description |
|---|---|
| {{log.input}} | The last message sent to the model. |
| {{log.output}} | The output response generated by the evaluated model. |
| {{log.messages}} | The messages sent to the model, without the last message. |
| {{log.retrievals}} | The Knowledge Base retrievals. |
| {{log.reference}} | The reference used to compare the output. |
Example
Evaluating the Familiarity of an output:
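As a sketch, a Number-type evaluation prompt for this example could look like the following, using the variables listed above (the wording is illustrative, not a prompt shipped with orq.ai):

```
Rate how familiar and conversational the tone of the following response is,
on a scale from 1 (very formal) to 10 (very familiar).
Reply with the number only.

Response:
{{log.output}}
```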
Testing an LLM Evaluator
Within the Studio, a Playground is available to test an evaluator against any output. This helps you quickly validate that an evaluator is behaving correctly. To do so, first configure the request:
Here you can configure the LLM payload that will be sent to an evaluator.

An LLM Evaluator test response.
Guardrail Configuration
Within a Deployment, you can use your LLM Evaluator as a Guardrail, effectively permitting a validation on the input and output of a deployment generation. Enabling the Guardrail toggle will block payloads that don't meet the expected score or boolean response. Once created, the Evaluator will be available to use in Deployments; to learn more, see Evaluators & Guardrails in Deployments.
Python Evaluator
Python Evaluators enable users to write custom Python code to create tailored evaluations, offering maximum flexibility for assessing text or data. From simple validations (e.g. regex patterns, data formatting) to complex analyses (e.g. statistical checks, custom scoring algorithms), they execute user-defined logic to measure specific criteria. When creating a Python Evaluator, you are taken to the code editor to configure your Python evaluation. To perform an evaluation, you have access to the log of the evaluated model, which contains the following fields:

| Field | Type | Description |
|---|---|---|
| log["input"] | str | The last message sent to generate the output. |
| log["output"] | str | The generated response from the model. |
| log["reference"] | str | The reference used to compare the output. |
| log["messages"] | list[str] | All previous messages sent to the model. |
| log["retrievals"] | list[str] | All Knowledge Base retrievals. |

Choose which type of output your evaluation will return:
- Number to return a score
- Boolean to return a true/false value
The following example compares the output size with the given reference.
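A minimal sketch of what this could look like, assuming the entry point receives the log dictionary described above (the exact signature in the code editor may differ):

```python
# Hypothetical Number Evaluator: scores the output length against the reference length.
def length_ratio(output: str, reference: str) -> float:
    """Helper returning the ratio of output length to reference length."""
    if not reference:
        return 0.0
    return len(output) / len(reference)

def evaluate(log) -> float:
    # Defined last, so it acts as the entry point when the Evaluator runs.
    return length_ratio(log["output"], log["reference"])
```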
You can define multiple methods within the code editor; the last method will be the entry point for the Evaluator when it runs.
Environment and Libraries
The Python Evaluator runs in the following environment: Python 3.12.
The environment comes preloaded with the following libraries:
Testing a Python Evaluator
Within the Studio, a Playground is available to test an evaluator against any output. This helps you quickly validate that an evaluator is behaving correctly. To do so, first configure the request:
Here you can configure the payload that will be sent to a Python evaluator.

A Python test response.