
Overview

A script-based dataset is a collection of test cases written as conversation scripts, which gives you full control over the conversation flow and user messages. Each script test case consists of one or more conversation turns. At every turn, you specify the exact user message to use and, optionally, the expected agent behavior. Quraite provides three ways to define the expected agent response:

Exact Match

The agent’s response must match the expected response exactly.

Regex Match

The agent’s response must match the expected response pattern using regex.

Semantic Match

The agent’s response must be judged by an LLM to match the expected response.
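The first two matching modes can be sketched in a few lines of Python. This is an illustration of the matching semantics, not Quraite's implementation; the function names are hypothetical:

```python
import re

def exact_match(response: str, expected: str) -> bool:
    # Exact Match: the response must equal the expected text, character for character.
    return response == expected

def regex_match(response: str, pattern: str) -> bool:
    # Regex Match: the response must contain a match for the expected pattern.
    return re.search(pattern, response) is not None

# Semantic Match is omitted here: it asks an LLM judge whether the response
# matches the expected content, so it requires a model call rather than a
# string comparison.
```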

Quraite also supports evaluating tool calls at each turn. For multiple tool calls, specify the expected order:

In Order

Tool calls match the specified sequence.

Any Order

Tool calls match regardless of sequence.

Quraite offers flexible evaluation options: check tool names only, or both names and arguments.
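The two ordering modes and the name-only versus name-plus-arguments options combine as sketched below. Again, this is an illustrative helper, not Quraite's implementation, and the call shape (`name`/`args` dicts) is an assumption:

```python
def tool_calls_match(actual, expected, in_order=True, check_args=True):
    # Reduce each call to the fields being evaluated:
    # the tool name only, or the name plus its arguments.
    def key(call):
        if check_args:
            return (call["name"], tuple(sorted(call.get("args", {}).items())))
        return call["name"]

    actual_keys = [key(c) for c in actual]
    expected_keys = [key(c) for c in expected]
    if in_order:
        # In Order: the sequences must be identical.
        return actual_keys == expected_keys
    # Any Order: the same calls must appear, in any sequence.
    return sorted(actual_keys) == sorted(expected_keys)
```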

How It Works

Quraite invokes the agent using the exact user messages specified at each turn. If expected behavior is defined, Quraite evaluates the agent’s response against it.
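The evaluation loop can be sketched as follows. This is a minimal illustration under assumed names (`agent` is any callable mapping a user message to a response; `check` is a per-turn predicate), not Quraite's implementation:

```python
def run_script_test_case(agent, turns):
    # Replay each scripted user message against the agent.
    results = []
    for turn in turns:
        response = agent(turn["user_message"])
        # Only turns that define expected behavior are evaluated;
        # the rest are replayed just to advance the conversation.
        check = turn.get("check")
        results.append(check(response) if check else None)
    return results

# Example with a stub agent that always asks for authentication details.
stub_agent = lambda msg: "Please share the email on your account."
turns = [
    {"user_message": "Hi, I want to check my order",
     "check": lambda r: "email" in r},
]
print(run_script_test_case(stub_agent, turns))  # [True]
```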

When to Use Script-Based Datasets

Script-based datasets work best when:
  • Replaying production traces. Test real conversations exactly as they occurred.
  • Debugging issues. Reproduce specific user problems with precise message sequences.
  • Running validation tests. Fast, predictable checks with deterministic inputs.

Create and Run Script-Based Test Cases

This guide uses the Retail Agent in the Default Project. Quraite creates this project automatically at signup.
Step 1: Navigate to the Projects page

In the Quraite dashboard, navigate to the Projects page.
Step 2: Navigate to the Default Project

Click on the Default Project in the list of projects.
Step 3: Navigate to the Datasets page

Click Datasets in the left sidebar.
Step 4: Select the Script-based Dataset

Click the Script-based Dataset in the list of datasets.
Step 5: Create a new test case

This test case tests retrieving order status.

Click + Test Case.

Enter Turn 1 details:

User message:
Hi, I want to check status of my order W3372648

Expected agent behavior:
  1. Select Evaluation Approach as LLM.
  2. Enter Expected Content:
     Agent asks for user authentication details.

Enter Turn 2 details:

User message:
My email address is yara.johansson3155@example.com

Expected agent behavior:
  1. Select Evaluation Approach as Regex.
  2. Enter Expected Content:
     pending
  3. Click + Tool Call.
  4. Select Tool Call Evaluation Type as In Order.
  5. Enter the tool call name:
     find_user_id_by_email
  6. Enter the tool call arguments:
     email: "yara.johansson3155@example.com"
  7. Click + Tool Call.
  8. Enter the tool call name:
     get_order_details
  9. Enter the tool call arguments:
     order_id: "#W3372648"
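For reference, the test case entered above could be represented as plain data. The schema below is purely illustrative (key names like `turns` and `expected_tool_calls` are assumptions, not a Quraite format):

```python
# Hypothetical data representation of the two-turn test case above.
test_case = {
    "turns": [
        {
            "user_message": "Hi, I want to check status of my order W3372648",
            "evaluation": {
                "approach": "LLM",
                "expected_content": "Agent asks for user authentication details.",
            },
        },
        {
            "user_message": "My email address is yara.johansson3155@example.com",
            "evaluation": {"approach": "Regex", "expected_content": "pending"},
            "tool_call_evaluation_type": "In Order",
            "expected_tool_calls": [
                {"name": "find_user_id_by_email",
                 "args": {"email": "yara.johansson3155@example.com"}},
                {"name": "get_order_details",
                 "args": {"order_id": "#W3372648"}},
            ],
        },
    ],
}
```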
Step 6: Run the test case

  1. Select Retail Agent from the Select Agent dropdown.
  2. Click Run.

Code-based script test case definitions are coming soon.

Next Steps

  • Run the remaining sample test cases in the Script-based Dataset.
  • Create a new project and run test cases against your own agent.