Datasets
Prepare a Dataset for an Experiment.
Overview
Datasets hold Messages, Inputs and Expected Outputs for models to use within Experiments.
- Inputs – Variables that can be used in the prompt message, e.g., {{firstname}}.
- Messages – The prompt template, structured with system, user, and assistant roles.
- Expected Outputs – Reference responses that evaluators use to compare against newly generated outputs.
To get experiments ready, make sure you have models available by adding them to your Model Garden.
Note: You don’t need to include all three entities when uploading a dataset. Depending on your experiment, you can choose to include only inputs, messages, or expected outputs as needed. For example, you can create a dataset with just inputs.
Prerequisites
To get started with Datasets, make sure you have access to a Project and the sufficient Permissions within the workspace.
It can be recommended to be familiar with Playgrounds first, to understand how to configure model messages and Message Roles.
Setting up a Dataset
To create a new Dataset use the +
button on a Project Folder and select Dataset.
Once you enter a title for your dataset you will be taken to the Table View for your Dataset.
Your table has 3 columns:
- Inputs.
- Messages.
- Expected Outputs.
You can enter as many rows as needed in your Dataset.
Adding a Datapoint to the Dataset
To add a new datapoint click on the Add Row
button. Here by clicking on each cell, you can fill in corresponding data:
- (Optional) Key-value pair for each input added, sent as configuration for the model
- Messages details (including Message Roles for each message).
- (Optional) Expected Output a model should generate if given the message and input.
Messages can be used as Prompt Template during Experiment Runs, to learn more see Configuring a Prompt in Experiment.
Importing a CSV into a Data
To easily import Datasets you can choose to upload a .csv file containing your messages and reference details.
To do so, choose Import and drag-and-drop your file.
Your .csv file must use
,
as a delimiter.
You will then be able to configure the mapping from column in your csv to fields in the Dataset collection. Each row represents a separate datapoint.
data:image/s3,"s3://crabby-images/3b43b/3b43b2e7ae1b9d458dba673928bce6192812a119" alt="During import, the modal above opens to configure Mappings."
During import, the modal above opens to configure Mappings.
Inputs
Any number of column can be mapped to Inputs.
- The column name will be used as
key
for the input. - The row value will be used as
value
for the input
Messages
A single column can be mapped to Messages. The row value will be used as User Message for the prompt.
For Messages to be used within an Experiment, the Prompt needs to be configured to explicitely refer to the Dataset's message. To learn more, see Configuring a Prompt in Experiments.
Expected Output
A single column can be mapped to Expected Output, The row value will be used as Reference to compare outputs using Experiments.
data:image/s3,"s3://crabby-images/a19ce/a19ce0870f68d7901b751585b61021844b03b672" alt="Creating a Dataset from an imported CSV file."
Creating a Dataset from an imported CSV file.
Creating an Experiment from Dataset
The next step to use a Dataset is to create an Experiment.
To learn more about creating an Experiment, see Setting up an Experiment
Updated 9 days ago