Curated datasets
Curated datasets are used to fine-tune models. This page explains how to create them.
What are curated datasets
A standard dataset consists of a prompt and an attached reference. In Orq.ai you can also create curated datasets.
Curated datasets are sets of inputs and outputs that a human has evaluated. In other words, each entry pairs a prompt with an expected output.
These curated datasets can be used to fine-tune a model. Because you're providing a model with specific examples, you can help tune its behavior more precisely to your needs.
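As an illustration of the prompt and expected-output pairing described above, here is a minimal Python sketch. The `CuratedEntry` class, its field names, and the JSON Lines layout are assumptions made for this example; they are not Orq.ai's schema or export format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CuratedEntry:
    # Hypothetical field names chosen for illustration only.
    prompt: str
    expected_output: str


def to_jsonl(entries):
    """Serialize entries as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(entry)) for entry in entries)


# Made-up examples of human-verified prompt/output pairs.
entries = [
    CuratedEntry(
        prompt="What is the refund window?",
        expected_output="Refunds are accepted within 30 days of purchase.",
    ),
    CuratedEntry(
        prompt="Do you ship internationally?",
        expected_output="Yes, we ship to most countries.",
    ),
]

jsonl = to_jsonl(entries)
```

Many fine-tuning pipelines accept a one-example-per-line file of this kind, but the exact keys they expect vary, so check the target format before reusing this layout.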
Creating curated datasets
In any module, open the Logs tab and select a single Log. The Feedback panel is displayed on the right.
A domain expert can review the logs to verify and correct the output.
To add a correction, select the Add correction button.
Adding a correction opens a new Correction message in which you can manually edit the response provided by the model. Once you have finished editing, select Save to use that correction.
Once you have reviewed the response and, if necessary, corrected it, you can add it to a Curated Dataset. To do so, select the Add to Dataset icon at the top-right of the response.
A list of the datasets you can add the log to is displayed. To create a new Dataset, see Datasets.
Using a curated dataset with Experiments
Another common use case is using curated datasets within the Experiments module.
When setting up an Experiment, import the curated dataset, attach the Cosine Similarity Evaluator, and compare which model or prompt scores best against the Curated Dataset.
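To make the comparison concrete, the sketch below scores two candidate outputs against a curated reference using cosine similarity. It is an assumption-laden illustration: it builds simple word-count vectors, whereas an evaluator in practice would more likely compare embeddings, and the function name, model names, and sample texts are all hypothetical rather than taken from Orq.ai.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between word-count vectors of two texts."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Counter returns 0 for missing words, so only shared words contribute.
    dot = sum(count * vec_b[word] for word, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat empty text as having no similarity.
    return dot / (norm_a * norm_b)


# Hypothetical curated reference and candidate outputs.
reference = "Refunds are accepted within 30 days"
candidates = {
    "model_a": "Refunds are accepted within 30 days",
    "model_b": "We do not offer refunds",
}

scores = {
    name: cosine_similarity(output, reference)
    for name, output in candidates.items()
}
best = max(scores, key=scores.get)
```

A score close to 1 means the output closely matches the curated reference, and a score close to 0 means the two share little. Word-count vectors ignore meaning and word order, which is why embedding-based similarity is usually preferred for real evaluations.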
To learn more about using datasets in experiments, see Datasets in Experiments.