Agents

An Agent is the AI system you want to test — a call-center bot, customer-service assistant, or any voice or text agent. RubricHQ connects to your agent the same way a real user would (a phone call, a web call, or a text chat), runs your scenarios against it, and scores the results.

Anatomy of an agent

When you create an agent (AI Testing Agent → Create Agent), you configure:

Name — a label to identify the agent across runs and reports.
Instructions — the system prompt that defines how the agent behaves. This is also what the Prompt Optimizer iterates on.
First message — the opening line the agent speaks when a call connects.
Model — the LLM powering the agent (GPT-4o, Gemini, and others).
Language & voice — the speech language and voice provider used for voice channels.
Phone number — required for phone testing; the number RubricHQ dials.
Web call URL — the endpoint RubricHQ connects to for web (WebSocket / WebRTC) testing.
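To make the field list concrete, here is a minimal sketch of how these settings could be grouped and checked before a run. The `AgentConfig` class, its field names, and `missing_fields` are hypothetical and are not part of any RubricHQ API. The phone requirement comes from the list above; treating the web call URL as required for web testing, and nothing as required for text, is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentConfig:
    """Hypothetical container mirroring the fields listed above."""
    name: str
    instructions: str            # system prompt
    first_message: str           # opening line spoken when a call connects
    model: str                   # e.g. "GPT-4o"
    language: str
    voice_provider: str
    phone_number: Optional[str] = None   # needed for phone testing
    web_call_url: Optional[str] = None   # endpoint used for web testing


# Assumed mapping from channel to the fields it needs.
REQUIRED_BY_CHANNEL = {
    "phone": ["phone_number"],
    "web": ["web_call_url"],
    "text": [],
}


def missing_fields(cfg: AgentConfig, channel: str) -> List[str]:
    """Return the names of fields that are unset but needed for `channel`."""
    if channel not in REQUIRED_BY_CHANNEL:
        raise ValueError(f"unknown channel: {channel!r}")
    return [f for f in REQUIRED_BY_CHANNEL[channel] if not getattr(cfg, f)]
```

For example, an agent configured with only a web call URL would report `phone_number` as missing for the phone channel and nothing as missing for web.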

Channels

An agent can be tested over three channels. RubricHQ picks the default automatically — phone if the agent has a phone number, otherwise web — and you can override it per run.

Phone — a real call placed over Twilio.
Web — a browser / WebSocket voice call (LiveKit, Daily, Vapi, or Retell transports).
Text — a text-only conversation, with no audio.

The channel determines how RubricHQ reaches your agent, not how the agent is built. To exercise the same agent over both phone and web, run it once on each channel.
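The default-channel rule described above (phone if the agent has a phone number, otherwise web, with an optional per-run override) can be sketched as a small function. The function name, the channel strings, and the `override` parameter are illustrative assumptions, not RubricHQ identifiers.

```python
from typing import Optional

CHANNELS = ("phone", "web", "text")


def resolve_channel(phone_number: Optional[str],
                    override: Optional[str] = None) -> str:
    """Pick the channel for a run.

    A per-run override wins; otherwise default to phone when the agent
    has a phone number, and to web when it does not.
    """
    if override is not None:
        if override not in CHANNELS:
            raise ValueError(f"unknown channel: {override!r}")
        return override
    return "phone" if phone_number else "web"
```

Note that under this rule text is never chosen by default; it is only reached through an explicit override.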

Internal vs. external agents

Internal — an agent RubricHQ orchestrates end to end, including provisioning the voice room and joining it.
External — an agent hosted on a third-party platform (e.g. Vapi or Retell). RubricHQ connects over the vendor’s transport and drives the conversation as the caller.

For web transports that use rooms (LiveKit, Daily), you choose who creates the room:

RubricHQ-managed — RubricHQ creates the room and your agent joins it.
Client-managed — your bot runner creates the room and hands RubricHQ the credentials to join.
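The room-ownership choice can be summarized as a lookup: it applies only to room-based transports, and the selected mode decides which side creates the room. The sketch below uses hypothetical mode strings (`"rubrichq_managed"`, `"client_managed"`) and a hypothetical function name; it illustrates the decision only and does not create any rooms.

```python
from typing import Optional

# Web transports that use rooms, as named in the text above.
ROOM_TRANSPORTS = {"livekit", "daily"}


def room_creator(transport: str, room_mode: str) -> Optional[str]:
    """Return which side creates the room, or None if no room is involved."""
    if transport.lower() not in ROOM_TRANSPORTS:
        return None  # e.g. Vapi or Retell: the room choice does not apply
    if room_mode == "rubrichq_managed":
        return "rubrichq"   # RubricHQ creates the room, the agent joins it
    if room_mode == "client_managed":
        return "client"     # the bot runner creates it and shares credentials
    raise ValueError(f"unknown room mode: {room_mode!r}")
```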

Standard metrics

Every voice run is automatically scored with standard metrics (latency, interruptions, silence, and more — see Metrics & Evaluation). You can disable specific standard metrics per agent if they don’t apply to your use case.
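As an illustration of per-agent opt-outs, the sketch below filters a set of standard metrics by an agent's disabled list. The metric identifiers are assumptions derived from the three metrics named above (the full set is larger), and `active_metrics` is a hypothetical helper rather than a RubricHQ function.

```python
from typing import Iterable, List

# Only the metrics named in the text; the real list includes more.
STANDARD_METRICS = ("latency", "interruptions", "silence")


def active_metrics(disabled: Iterable[str]) -> List[str]:
    """Return the standard metrics that remain enabled for an agent."""
    disabled_set = set(disabled)
    unknown = disabled_set - set(STANDARD_METRICS)
    if unknown:
        raise ValueError(f"unknown metrics: {sorted(unknown)}")
    # Preserve the declared order of STANDARD_METRICS.
    return [m for m in STANDARD_METRICS if m not in disabled_set]
```

Rejecting unknown names, rather than silently ignoring them, is a design choice here so that a typo in a disabled list does not leave a metric running unnoticed.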

Once your agent is created, define Scenarios to describe the callers and goals it will be tested against, then launch a Batch Run.