Overview
A scenario-based dataset is a collection of test cases written as free-form scenarios. Use these datasets to evaluate how your agent handles realistic user interactions. Each scenario test case primarily consists of two parts:
Scenario
A natural language description of the user’s persona, intent, and relevant context. This defines who the user is, why they interact with the agent, and any background information that shapes the conversation.
Expected Behavior
A step-by-step description of how the agent should respond, written in plain language.
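As a rough illustration of the two parts, a test case can be thought of as a pair of plain-text fields. The sketch below is only a mental model: the `ScenarioTestCase` class, its field names, and the sample refund scenario are hypothetical and are not Quraite's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScenarioTestCase:
    """Hypothetical container for one scenario test case (not Quraite's schema)."""

    # Who the user is, why they contact the agent, and any relevant background.
    scenario: str
    # Plain-language steps the agent is expected to follow, in order.
    expected_behavior: List[str] = field(default_factory=list)


# Illustrative example only; the wording is made up for this sketch.
refund_case = ScenarioTestCase(
    scenario=(
        "A frustrated customer of an online retail store wants a refund for an "
        "order that arrived damaged. They have already contacted support once."
    ),
    expected_behavior=[
        "Acknowledge the problem and ask for the order number.",
        "Look up the order and confirm whether it is eligible for a refund.",
        "Explain the refund steps and the expected timeline.",
    ],
)

print(refund_case.scenario)
for step_number, step in enumerate(refund_case.expected_behavior, start=1):
    print(f"{step_number}. {step}")
```

Both fields stay free-form text, which matches the idea that scenarios and expected behavior are written in natural language rather than as fixed message transcripts.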
How It Works
Quraite uses the scenario description to generate realistic user messages and invokes the agent across multiple turns.
Fail Fast
Multi-turn evaluations have a common challenge: if the agent deviates from the expected path early in the conversation, subsequent turns become meaningless. Quraite addresses this by evaluating after every turn. When the agent’s response fails to match the expected behavior, the evaluation stops immediately. This catches failures early and saves time and tokens.
Scenario Completion
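Evaluating after every turn, stopping on the first failure, and otherwise advancing until the scenario completes amounts to a simple loop. The sketch below is a conceptual model under stated assumptions, not Quraite's implementation: `run_scenario`, `simulate_user`, `agent`, and `judge` are hypothetical names, the toy judge is a plain substring check, and mapping one expected step to one turn is a simplification.

```python
from typing import Callable, List, Tuple


def run_scenario(
    expected_steps: List[str],
    simulate_user: Callable[[int], str],
    agent: Callable[[str], str],
    judge: Callable[[str, str], bool],
) -> Tuple[bool, int]:
    """Return (passed, turns_run), stopping at the first failing turn."""
    for turn, expected in enumerate(expected_steps):
        user_message = simulate_user(turn)   # message generated for this turn
        reply = agent(user_message)          # invoke the agent under test
        if not judge(reply, expected):       # evaluate after every turn
            return False, turn + 1           # fail fast: skip remaining turns
    return True, len(expected_steps)         # every turn passed


# Toy stand-ins so the sketch runs on its own (all hypothetical).
messages = ["My package arrived broken.", "It is order 1234."]


def toy_user(turn: int) -> str:
    return messages[turn]


def toy_agent(message: str) -> str:
    # Always asks for the order number, so it cannot pass a later "refund" step.
    return "Could you share your order number?"


def toy_judge(reply: str, expected: str) -> bool:
    return expected in reply.lower()


print(run_scenario(["order number", "refund"], toy_user, toy_agent, toy_judge))
# (False, 2)
```

In this toy run the first turn passes and the second fails, so the loop reports failure after two turns instead of continuing through any later steps.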
On success, the test case automatically advances to the next turn until the scenario completes.
When to Use Scenario-Based Datasets
Scenario-based datasets work best when:
- Production traces are unavailable. New agents or features lack real user data. Scenario-based datasets let teams define test cases in natural language before launch.
- Evaluations cover multiple user personas. Different users interact differently. A frustrated customer repeats questions and expresses impatience. A first-time user asks for clarification. A non-native speaker uses simpler vocabulary or unconventional phrasing.
- User messages need natural variation. Quraite generates different phrasings for each run based on the scenario description. A scenario like “user asks about refund policy” produces varied messages: “How do I get a refund?”, “What’s your return policy?”, “I want my money back.”
- The same scenario runs under different contexts. Test how context affects agent behavior. A pricing question from a free-tier user requires a different response than the same question from an enterprise customer.
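The last point, running one scenario under several contexts, can be pictured as pairing a shared base scenario with a context sentence and a context-specific expected behavior. This is a minimal sketch with hypothetical names (`BASE_SCENARIO`, `CONTEXTS`, `build_variants`) and made-up wording; it does not reflect how Quraite stores or generates test cases.

```python
from typing import Dict, List, Tuple

# Shared part of the scenario (illustrative wording).
BASE_SCENARIO = "The user asks what it would cost to upgrade their plan."

# Context name -> (context sentence, expected behavior). All values are illustrative.
CONTEXTS: Dict[str, Tuple[str, List[str]]] = {
    "free_tier": (
        "The user is on the free tier.",
        ["Summarize the paid plans and their prices.", "Offer a link to upgrade."],
    ),
    "enterprise": (
        "The user is an existing enterprise customer.",
        [
            "Explain that enterprise pricing depends on the contract.",
            "Offer to connect the user with their account manager.",
        ],
    ),
}


def build_variants(
    base: str, contexts: Dict[str, Tuple[str, List[str]]]
) -> List[dict]:
    """Create one test-case dictionary per context from a shared base scenario."""
    variants = []
    for name, (context, expected) in contexts.items():
        variants.append(
            {
                "name": name,
                "scenario": f"{context} {base}",
                "expected_behavior": expected,
            }
        )
    return variants


variants = build_variants(BASE_SCENARIO, CONTEXTS)
for variant in variants:
    print(variant["name"], "->", variant["scenario"])
```

Keeping the base scenario in one place makes it easier to see that only the context and the expected behavior change between variants.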
Create and Run Scenario-Based Test Cases
This guide uses the Retail Agent in the Default Project. Quraite creates this project automatically at signup.
Select the Scenario dataset
Click the Scenario-based Dataset in the list of datasets.
The Scenario-based Dataset includes sample test cases. Ignore these for now.
Defining scenarios takes time, but thorough tests build confidence in your agent.
Automatic scenario generation is coming soon.
Next Steps
- Run the remaining sample test cases in the Scenario-based Dataset.
- Create a new project and run test cases against your own agent.

