improved

Improved LLM-as-a-Judge

7 months ago by Kyra Dresen

LLM-as-a-Judge Enhancements:
We’ve improved our existing LLM Evaluator feature with type-safe outputs, asynchronous evaluation on live deployments, and a Guardrail Mode.

1. Type-Safe Output:

When you configure an output type, such as number or boolean, the evaluator's result strictly adheres to that type.
Each output also includes a clear, concise explanation of the result, which makes evaluations more transparent and easier to verify.
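
As a rough illustration of what a typed result with an explanation can look like on the consuming side, here is a minimal Python sketch. It assumes the judge returns a JSON object with `value` and `explanation` fields. That shape, along with the names `JudgeResult` and `parse_judge_output`, is hypothetical and not taken from the product's API.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class JudgeResult:
    """Hypothetical container for a typed judge output plus its explanation."""
    value: Union[bool, int, float]
    explanation: str


def parse_judge_output(raw: str, output_type: str) -> JudgeResult:
    """Parse a raw JSON judge response and check it against the configured type.

    Assumes the response looks like {"value": ..., "explanation": "..."}.
    """
    data = json.loads(raw)
    value = data["value"]
    explanation = data["explanation"]

    if output_type == "boolean":
        if not isinstance(value, bool):
            raise TypeError(f"expected boolean, got {type(value).__name__}")
    elif output_type == "number":
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"expected number, got {type(value).__name__}")
    else:
        raise ValueError(f"unsupported output type: {output_type}")

    if not isinstance(explanation, str) or not explanation.strip():
        raise ValueError("explanation must be a non-empty string")

    return JudgeResult(value=value, explanation=explanation)
```

For example, `parse_judge_output('{"value": 4, "explanation": "Mostly accurate."}', "number")` returns a `JudgeResult` whose `value` is `4`, while the same payload checked against `"boolean"` raises a `TypeError`.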

2. Extended Functionality:

The LLM-as-a-Judge can now be attached to live deployments, where it evaluates a configurable sample of traffic asynchronously. Because evaluation is asynchronous, it yields quality insights without disrupting workflows, helping you maintain consistent quality and detect drift.
The LLM-as-a-Judge also supports Guardrail Mode, actively enforcing quality standards by blocking inputs or outputs that fail to meet defined criteria.
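
The two modes described above can be sketched conceptually in Python. This is an illustration under stated assumptions, not the product's implementation: every name here (`toy_judge`, `run_sampled`, `apply_guardrail`, `GuardrailBlocked`) is hypothetical, and the judge is a trivial stand-in for a real LLM call.

```python
import asyncio
import random
from typing import Callable, List, Tuple

# All names below are hypothetical and only illustrate the two modes.
Judge = Callable[[str], bool]


class GuardrailBlocked(Exception):
    """Raised when guardrail mode rejects an input or output."""


def toy_judge(text: str) -> bool:
    # Stand-in for a real LLM call: fails any text containing "forbidden".
    return "forbidden" not in text.lower()


async def evaluate(text: str, judge: Judge,
                   results: List[Tuple[str, bool]]) -> None:
    # Run the blocking judge in a worker thread so the event loop stays free.
    loop = asyncio.get_running_loop()
    passed = await loop.run_in_executor(None, judge, text)
    results.append((text, passed))


async def run_sampled(texts: List[str], judge: Judge,
                      sample_rate: float, seed: int = 0) -> List[Tuple[str, bool]]:
    """Asynchronous mode: evaluate only a sampled fraction of the texts."""
    rng = random.Random(seed)
    results: List[Tuple[str, bool]] = []
    tasks = []
    for text in texts:
        # rng.random() is in [0.0, 1.0): a rate of 1.0 samples everything,
        # a rate of 0.0 samples nothing.
        if rng.random() < sample_rate:
            tasks.append(asyncio.create_task(evaluate(text, judge, results)))
        # In a live deployment the response would be returned here, unmodified.
    await asyncio.gather(*tasks)
    return results


def apply_guardrail(text: str, judge: Judge) -> str:
    """Guardrail mode: block text that fails the judge, otherwise pass it on."""
    if not judge(text):
        raise GuardrailBlocked("text failed the judge's criteria")
    return text
```

The design difference is where the judge sits: in the sampled path the evaluation runs in the background and never alters the response, whereas `apply_guardrail` runs before the text is passed on and raises `GuardrailBlocked` when the criteria are not met.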