improved

Improved LLM-as-a-Judge

LLM-as-a-Judge Enhancements:
We’ve significantly improved our existing LLM-as-a-Judge feature to provide more robust evaluation capabilities and enforce type-safe outputs.

1. Type-Safe Output:

  • When you configure an output type (for example, a number or a boolean), the judge returns a result that strictly conforms to that type.
  • Additionally, each output includes a clear, concise explanation to support transparency and reliability in the evaluation process.
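To illustrate what a type-safe result with an explanation can look like on the consuming side, here is a minimal sketch. It assumes the judge returns a JSON object with `value` and `explanation` fields; those field names, the `OUTPUT_TYPES` mapping, and `parse_judge_output` are hypothetical and are not the product's actual API.

```python
import json
from dataclasses import dataclass
from typing import Union

# Hypothetical mapping from a configured output type to accepted Python types.
OUTPUT_TYPES = {
    "number": (int, float),
    "boolean": (bool,),
    "string": (str,),
}


@dataclass(frozen=True)
class JudgeResult:
    value: Union[bool, int, float, str]
    explanation: str


def parse_judge_output(raw: str, output_type: str) -> JudgeResult:
    """Parse a raw judge response and check it against the configured type."""
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"unsupported output type: {output_type}")

    data = json.loads(raw)
    value = data["value"]              # assumed field name
    explanation = data["explanation"]  # assumed field name

    # bool is a subclass of int in Python, so reject it explicitly for "number".
    if output_type == "number" and isinstance(value, bool):
        raise TypeError("expected a number, got a boolean")
    if not isinstance(value, OUTPUT_TYPES[output_type]):
        raise TypeError(f"expected {output_type}, got {type(value).__name__}")
    if not isinstance(explanation, str) or not explanation.strip():
        raise ValueError("explanation must be a non-empty string")

    return JudgeResult(value=value, explanation=explanation)
```

A caller would pass the raw judge response together with the configured type, for example `parse_judge_output(raw, "boolean")`, and treat a `TypeError` as a malformed evaluation rather than as a failing score.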

2. Extended Functionality:

  • The LLM-as-a-Judge can now run on live deployments, evaluating quality asynchronously at a configurable sample rate. Because evaluation does not interrupt the workflow being evaluated, you can monitor quality continuously and catch drift early.
  • The LLM-as-a-Judge also supports Guardrail Mode, actively enforcing quality standards by blocking inputs or outputs that fail to meet defined criteria.
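The two modes above differ mainly in whether the judge sits in the request path. The following sketch contrasts them under stated assumptions: the judge is modeled as a plain callable that returns `True` on pass, and `should_sample`, `evaluate_async`, and `apply_guardrail` are hypothetical helper names, not part of the product's API.

```python
import random
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Optional

# Hypothetical judge signature: returns True when the text meets the criteria.
Judge = Callable[[str], bool]


def should_sample(sample_rate: float, rng: random.Random) -> bool:
    """Decide whether this request is evaluated, given a rate in [0, 1]."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    # rng.random() is in [0, 1), so 0.0 never samples and 1.0 always samples.
    return rng.random() < sample_rate


def evaluate_async(
    output: str,
    judge: Judge,
    sample_rate: float,
    executor: ThreadPoolExecutor,
    rng: random.Random,
) -> Optional[Future]:
    """Monitoring mode: schedule the judge in the background for sampled requests."""
    if should_sample(sample_rate, rng):
        return executor.submit(judge, output)
    return None


def apply_guardrail(text: str, judge: Judge) -> Optional[str]:
    """Guardrail mode: run the judge inline and block text that fails."""
    return text if judge(text) else None
```

In the monitoring path the caller returns the model output immediately and reads the `Future` later, while in the guardrail path the caller waits for the verdict and receives `None` when the text is blocked.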