Scenarios

A Scenario is a single test case: a simulated caller with a personality, an opening, and a goal. When RubricHQ runs a scenario, an AI-driven caller plays out that role against your agent so you can see how the agent handles it.

Anatomy of a scenario

Create a scenario under Scenarios → Create Scenario. Each scenario has the following fields:

Persona — who the caller is and how they behave (e.g. “a polite customer in a hurry” or “a frustrated caller disputing a charge”). The persona shapes tone, patience, and style.
Instructions — what the caller is trying to do, step by step (e.g. “ask to make a full payment by card, give the card number when asked, then end the call”).
Opening prompt — what the caller says first, if the caller speaks before the agent.
Expected outcome — what a successful conversation looks like. This is the bar the agent is measured against.
Tags — labels used to group and select scenarios (for example, run only smoke scenarios in CI).
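
To make the fields above concrete, here is a minimal sketch of a scenario as a plain Python record. The `Scenario` class and its field names are hypothetical: they mirror the form labels for illustration and are not a RubricHQ API or export format. The opening line in the example is also invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Scenario:
    """Hypothetical local model of a scenario (not a RubricHQ API)."""
    persona: str                          # who the caller is and how they behave
    instructions: str                     # what the caller is trying to do
    expected_outcome: str                 # what a successful conversation looks like
    opening_prompt: Optional[str] = None  # None when the agent speaks first
    tags: List[str] = field(default_factory=list)


card_payment = Scenario(
    persona="A polite customer in a hurry",
    instructions=(
        "Ask to make a full payment by card, give the card number "
        "when asked, then end the call."
    ),
    expected_outcome="Caller completes the payment and the agent confirms the amount.",
    opening_prompt="Hi, I'd like to pay my balance in full.",  # invented example line
    tags=["smoke", "billing"],
)
```

Writing a scenario out this way before entering it in the form can help check that the persona, the instructions, and the expected outcome describe the same single goal.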

Attaching metrics

Each scenario carries the metrics it should be evaluated against. After a call completes, RubricHQ scores the transcript and audio against those metrics. Critical metrics determine whether the run passes — see Metrics & Evaluation.

A scenario’s metrics travel with it into every run. Standard voice metrics (latency, interruptions, silence, and so on) are applied automatically on top of the metrics you attach.
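
The role of critical metrics can be sketched as follows. This is an illustration under an assumption, not RubricHQ's actual scoring logic: it assumes a run passes only when every critical metric passes, and that non-critical metrics are reported without affecting the result. `MetricResult`, `run_passes`, and the metric names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MetricResult:
    """Hypothetical per-metric result for one run."""
    name: str
    passed: bool
    critical: bool = False


def run_passes(results: List[MetricResult]) -> bool:
    # Assumption: only critical metrics gate the run, and all of them must pass.
    # With no critical metrics attached, this sketch treats the run as passing.
    return all(r.passed for r in results if r.critical)


results = [
    MetricResult("payment_confirmed", passed=True, critical=True),
    MetricResult("latency", passed=False),  # non-critical: reported, not gating
]
print(run_passes(results))  # True
```

Under this assumption, marking a metric as critical is what turns it from an observation into a gate, so it is worth reserving that flag for the outcomes the scenario exists to check.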

Generating scenarios

Rather than writing every scenario by hand, you can generate a set from a short description of your agent and the situations you want to cover. RubricHQ drafts personas, instructions, and outcomes that you can review and edit before saving.

Designing good scenarios

Keep each scenario focused — one goal per scenario makes pass/fail unambiguous and easier to debug.
Make the persona realistic — edge-case personas (impatient, confused, adversarial) surface failures that happy-path tests miss.
Write a concrete outcome — “caller completes the payment and the agent confirms the amount” is measurable; “the call goes well” is not.
Tag for selection — group scenarios by suite (smoke, regression, billing) so you can run the right subset in each environment.

Use a small, tagged smoke suite for fast CI gates and a larger tagged suite for nightly or pre-release runs.
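
Tag-based selection amounts to a simple filter. The sketch below works on plain dictionaries and is purely illustrative: `select_by_tags`, the scenario names, and the "match any wanted tag" rule are assumptions, not how RubricHQ selects scenarios.

```python
from typing import Dict, Iterable, List


def select_by_tags(scenarios: List[Dict], wanted: Iterable[str]) -> List[Dict]:
    """Return scenarios that carry at least one of the wanted tags, in order."""
    wanted_set = set(wanted)
    return [s for s in scenarios if wanted_set & set(s["tags"])]


# Hypothetical scenario records; only the name and tags matter here.
scenarios = [
    {"name": "card-payment-happy-path", "tags": ["smoke", "billing"]},
    {"name": "disputed-charge-frustrated", "tags": ["regression", "billing"]},
    {"name": "confused-caller-wrong-number", "tags": ["regression"]},
]

smoke_suite = select_by_tags(scenarios, ["smoke"])
nightly_suite = select_by_tags(scenarios, ["smoke", "regression"])
print([s["name"] for s in smoke_suite])  # ['card-payment-happy-path']
print(len(nightly_suite))                # 3
```

Because a scenario can carry several tags, the same scenario can belong to both the fast smoke suite and the larger nightly suite without being duplicated.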