Qualifications
Verify that operators and AI agents are competent to handle specific workflows before routing real interactions.
Qualifications let you verify that a member -- human or AI -- is ready to handle a specific workflow. Before an operator goes live or an agent is assigned real interactions, they complete requirements (knowledge questions or simulated chat scenarios) that are graded automatically and reviewed by an admin.
This ensures that every person or agent interacting with patients meets a measurable standard of competency before they handle real cases.
Key Concepts
Qualification
A qualification defines what "competent" means for a particular workflow. Each workflow can have at most one active qualification.
A qualification has:
- Name and description
- Linked workflow -- which workflow this qualification applies to
- Requirements -- what a member must do to qualify
- Auto-advance -- whether members automatically progress when all requirements pass
Status Lifecycle
Members progress through these statuses:
| Status | Meaning | Can Handle Real Interactions? |
|---|---|---|
| Enrolled | Working on requirements | No |
| Submitted | All requirements passed | No |
| Provisional | Admin approved submitted work | Yes (with monitoring) |
| Qualified | Admin confirmed real-world readiness | Yes |
| Suspended | Temporarily removed | No |
Key transitions:
- Enrolled -> Submitted -- Automatic when all requirements have passing submissions (if auto-advance is on)
- Submitted -> Provisional -- Admin reviews and approves the submitted work
- Provisional -> Qualified -- Admin confirms the member performs well on real interactions
- Any -> Suspended -- Admin action (requires a reason)
The Provisional -> Qualified step is always a human decision. Passing automated tests does not prove real-world readiness.
Requirements
Requirements define what a member must complete. Two types:
Question requirements -- The member answers a written knowledge question. Use these for policy knowledge, protocol understanding, or situational judgment.
Chat requirements -- The member completes a simulated conversation through the workflow. A test scenario is set up with a simulated persona, and an AI plays the role of the caller/patient. Use these for practical skill evaluation.
Requirements are ordered and identified by a unique slug within the qualification.
Criteria
Each requirement is graded by one or more criteria:
Formula criteria -- Automatic pass/fail using CEL (Common Expression Language) expressions. For example: `answer.contains("HIPAA")` for a question, or `assignment.outcome == "success"` for a chat scenario. See Writing Formula Criteria for the full reference.
AI Evaluation criteria -- LLM-evaluated using a prompt template. You write a Jinja2 prompt describing what to evaluate, and an AI scores the response on a 0.0--1.0 scale. See Writing AI Evaluation Criteria for the full reference.
Each criterion has a weight (for overall scoring) and a passing threshold (for AI Evaluation criteria).
Setting Up Qualifications
1. Create a Qualification
- Go to People -> Qualifications
- Click Create Qualification
- Enter a name, optional description, and select the workflow
- Choose a color for visual identification
2. Add Requirements
- Open the qualification and go to the Requirements tab
- Click Add Requirement
- Configure:
- Slug -- Unique identifier (e.g., `identity-verification`)
- Name -- Display name
- Type -- Question or Chat
- For questions: enter the Question Text
- For chats: configure the Scenario Definition (see below)
3. Add Criteria
For each requirement:
- Click Add Criterion
- Choose Formula or AI Evaluation
- For Formula criteria: write a CEL expression (see Writing Formula Criteria)
- For AI Evaluation criteria: write an evaluation prompt template (see Writing AI Evaluation Criteria)
- Set the weight and passing threshold
4. Enroll Members
- Go to a member's profile -> Qualifications tab
- Click Assign Qualification
- The member starts in Enrolled status
Chat Scenario Definition
Chat requirements use a scenario definition to set up the simulated environment:
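A minimal scenario definition might look like the sketch below. The field names (`persona`, `identity`, `dataRecords`, `labels`) come from this guide; every concrete value -- the persona text, the `appointments` data type slug, the label -- is illustrative:

```yaml
persona:
  identity: |
    You are Jordan Reyes, a 52-year-old patient calling to reschedule
    next week's cardiology appointment. You are polite but pressed for
    time, and you only share your date of birth when asked directly.
dataRecords:
  appointments:
    - date: "2024-06-12"
      provider: "Dr. Patel"
      status: "scheduled"
labels:
  - test-patient
```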
- persona -- Defines the simulated caller. The `identity` field is the system prompt that drives the AI playing the caller role. Write it in second person ("You are...") and include personality traits, goals, and constraints that make the scenario realistic.
- dataRecords -- Pre-populated data records for the test member, keyed by data type slug. These appear in the system as if they already existed, so the agent or operator can look them up.
- labels -- Labels applied to the test member before the scenario starts.
The scenario definition only sets up the test environment. Pass/fail evaluation is handled by the criteria you define on the requirement.
Writing Formula Criteria (CEL Expressions)
Formula criteria use CEL (Common Expression Language) expressions that evaluate to true (pass) or false (fail). The available context variables depend on whether the requirement is a question or a chat.
Context Variables for Question Requirements
| Variable | Type | Description |
|---|---|---|
| `answer` | string | The member's answer text |
| `question` | string | The requirement's question text |
| `member` | object | `{id, name, labels, form_data}` -- the member being evaluated |
Context Variables for Chat Requirements
Aggregate variables
These provide summary information about the completed chat:
| Variable | Type | Description |
|---|---|---|
| `assignment` | object | `{status, outcome, is_test}` -- the assignment result |
| `chat` | object | `{channel, message_count, duration_seconds}` -- conversation stats |
| `assignment_tasks` | object | `{visited: [...], visited_count}` -- which tasks (steps) were visited |
| `member` | object | `{id, name, labels, form_data}` -- the member being evaluated |
The messages list
The messages variable is a normalized sequential list of ALL messages in the conversation. This is the most powerful context variable for chat evaluation -- it lets you inspect exactly what happened, in what order.
Each message in the list has the following fields. All fields are always present (empty string or empty list when not applicable), so you can safely access any field without null checks.
| Field | Type | Description |
|---|---|---|
| `index` | int | Position in the conversation (0-based) |
| `role` | string | `"user"`, `"assistant"`, `"operator"`, or `"tool_result"` |
| `content` | string | The text content of the message |
| `tool_calls` | list | Tool calls made in this message. Each has `{name, arguments}` |
| `tool_name` | string | For `tool_result` messages, the name of the tool that produced this result |
Example CEL Expressions
Question criteria
Check answer length:
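A sketch of such a check (the 100-character minimum is an illustrative threshold):

```cel
size(answer) >= 100
```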
Check for required keywords:
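For example, requiring the answer to mention both terms (the second keyword is illustrative):

```cel
answer.contains("HIPAA") && answer.contains("consent")
```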
Combine length and content checks:
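A sketch combining a length floor with a keyword check (both values illustrative):

```cel
size(answer) >= 100 && answer.contains("HIPAA")
```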
Chat outcome checks
Verify the assignment completed successfully:
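Using the `assignment` aggregate variable:

```cel
assignment.outcome == "success"
```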
Require a minimum conversation length:
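For example, requiring at least six messages (the threshold is illustrative):

```cel
chat.message_count >= 6
```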
Require the conversation finished within a time limit:
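For example, a ten-minute cap (the limit is illustrative):

```cel
chat.duration_seconds <= 600
```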
Verify a specific task (step) was visited during the conversation:
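A sketch using the `in` operator (the `verify-identity` task slug is hypothetical):

```cel
"verify-identity" in assignment_tasks.visited
```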
Tool call existence
Check that the agent called a specific tool at any point:
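A sketch using nested `exists` over the `messages` list (`lookup_member` is a hypothetical tool name):

```cel
messages.exists(m, m.tool_calls.exists(t, t.name == "lookup_member"))
```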
Tool call with specific arguments
Verify a tool was called with particular argument values:
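A sketch that assumes the `arguments` field is serialized text you can match with `contains`; the tool name and date are hypothetical:

```cel
messages.exists(m, m.tool_calls.exists(t,
  t.name == "reschedule_appointment" && t.arguments.contains("2024-06-12")))
```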
Ordering -- tool A before tool B
Verify that identity was checked before medical info was shared:
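A sketch comparing message indexes (both tool names are hypothetical). The inner `exists` only matches messages that come after the one containing the identity check:

```cel
messages.exists(a, a.tool_calls.exists(t, t.name == "verify_identity")
  && messages.exists(b, b.index > a.index
    && b.tool_calls.exists(t2, t2.name == "share_medical_info")))
```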
Negative assertions
Forbidden tool -- the agent must never escalate:
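A sketch (the `escalate` tool name is hypothetical):

```cel
!messages.exists(m, m.tool_calls.exists(t, t.name == "escalate"))
```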
Forbidden word (case-sensitive):
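For example, failing the chat if "guarantee" appears anywhere (the word is illustrative):

```cel
!messages.exists(m, m.content.contains("guarantee"))
```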
Case-insensitive check via regex:
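A sketch using the `(?i)` flag in `matches()` (the forbidden word is illustrative):

```cel
!messages.exists(m, m.content.matches("(?i)guarantee"))
```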
Only check assistant messages for forbidden content:
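A sketch that filters to assistant messages first (the forbidden word is illustrative):

```cel
messages.filter(m, m.role == "assistant")
  .all(m, !m.content.contains("guarantee"))
```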
Tool result checks
Verify that a tool returned a specific result:
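A sketch matching on `role`, `tool_name`, and `content` (the tool name and result text are hypothetical):

```cel
messages.exists(m, m.role == "tool_result"
  && m.tool_name == "verify_identity"
  && m.content.contains("verified"))
```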
Counting
Limit how many times a tool was called (e.g., no more than 3 searches):
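A sketch using `filter` plus `size` (the `search_knowledge_base` tool name is hypothetical):

```cel
size(messages.filter(m,
  m.tool_calls.exists(t, t.name == "search_knowledge_base"))) <= 3
```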
Combined real-world example
A complete criterion that checks multiple behaviors at once -- the agent must look up the appointment, then reschedule it, confirm the change to the caller, and never escalate:
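One way to express this as a single criterion; all tool names and the confirmation phrase below are hypothetical:

```cel
messages.exists(m, m.tool_calls.exists(t, t.name == "lookup_appointment"))
  && messages.exists(m, m.tool_calls.exists(t, t.name == "reschedule_appointment"))
  && messages.exists(m, m.role == "assistant" && m.content.matches("(?i)resched"))
  && !messages.exists(m, m.tool_calls.exists(t, t.name == "escalate"))
```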
CEL Quick Reference
| Function | Description |
|---|---|
| `size(list)` / `size(string)` | Returns the length of a list or string |
| `string.contains("substr")` | Returns true if the string contains the substring |
| `string.matches("regex")` | Returns true if the string matches the regex. Supports `(?i)` for case-insensitive matching |
| `list.exists(x, condition)` | Returns true if any element in the list satisfies the condition |
| `list.all(x, condition)` | Returns true if every element in the list satisfies the condition |
| `list.filter(x, condition)` | Returns a new list containing only elements that satisfy the condition |
| `"value" in list` | Returns true if the value is contained in the list |
Writing AI Evaluation Criteria (Jinja2 Templates)
AI Evaluation criteria use a Jinja2 prompt template that is rendered with context variables and sent to an LLM. The LLM returns a structured evaluation: a score (0.0--1.0), a pass/fail determination based on the passing threshold you set, and written reasoning.
Use AI Evaluation criteria when the judgment is too nuanced for a formula -- things like empathy, communication quality, clinical accuracy, or adherence to complex protocols.
Available Template Variables
For question requirements
| Variable | Description |
|---|---|
| `{{ requirement.name }}` | The requirement's display name |
| `{{ requirement.question }}` | The question text that was asked |
| `{{ submission.answer }}` | The member's submitted answer |
| `{{ member.name }}` | The member's name |
| `{{ member.labels }}` | Labels assigned to the member |
| `{{ answers.other_requirement_slug }}` | The member's answer to another requirement (by slug). Useful for cross-referencing. |
For chat requirements
| Variable | Description |
|---|---|
| `{{ transcript }}` | List of message dicts from the conversation |
| `{{ assignment.status }}` | The assignment's final status |
| `{{ assignment.outcome }}` | The assignment's outcome (e.g., "success", "failure") |
| `{{ requirement.scenario }}` | The scenario definition YAML |
| `{{ member.name }}` | The member's name |
Example Prompts
Question evaluation
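A sketch of a question-evaluation prompt; the template variables are the ones documented above, while the evaluation instructions in the final paragraph are illustrative:

```jinja2
You are grading a training answer for the requirement "{{ requirement.name }}".

Question:
{{ requirement.question }}

Answer from {{ member.name }}:
{{ submission.answer }}

Evaluate whether the answer correctly explains when patient information
may be shared and which verification steps must happen first. Return a
score from 0.0 to 1.0 with your reasoning.
```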
Chat evaluation
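A sketch of a chat-evaluation prompt; it assumes each entry in `transcript` exposes `role` and `content` fields like the `messages` list described earlier:

```jinja2
You are evaluating a simulated conversation handled by {{ member.name }}.

Transcript:
{% for message in transcript %}
{{ message.role }}: {{ message.content }}
{% endfor %}

The assignment finished with status "{{ assignment.status }}" and outcome
"{{ assignment.outcome }}". Score from 0.0 to 1.0 how clearly and
empathetically the member handled the caller's request.
```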
Cross-referencing other answers
You can reference the member's answers to other requirements by slug. This is useful for evaluating consistency between what someone says they will do and what they actually do in a simulation.
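A sketch, assuming a sibling question requirement with the hypothetical slug `escalation_policy`:

```jinja2
Earlier, the member described their escalation process in writing:

{{ answers.escalation_policy }}

Here is the transcript of their simulated chat:
{% for message in transcript %}
{{ message.role }}: {{ message.content }}
{% endfor %}

Score from 0.0 to 1.0 how consistently the member's behavior in the chat
matches the process they described, noting any contradictions.
```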
Evaluation Pipeline
When a member submits an answer or completes a chat scenario:
- Formula criteria run first (CEL auto-evaluation)
- If any Formula criterion fails -> Auto Failed (AI Evaluation criteria are skipped)
- AI Evaluation criteria run next (LLM evaluation)
- If AI Evaluation criteria exist -> Pending Review (needs human review)
- If only Formula criteria and all pass -> Auto Passed
- An overall score is computed as a weighted average of all criterion scores
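As a sketch, the weighted average described above works like this (plain Python for illustration, not the platform's implementation):

```python
def overall_score(criteria):
    """Weighted average of (score, weight) pairs; scores are in 0.0-1.0."""
    total_weight = sum(weight for _, weight in criteria)
    if total_weight == 0:
        return 0.0
    return sum(score * weight for score, weight in criteria) / total_weight

# A passing Formula criterion (score 1.0, weight 1) combined with an
# AI Evaluation criterion scored 0.8 at weight 3:
print(round(overall_score([(1.0, 1), (0.8, 3)]), 2))  # 0.85
```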
Human Review
Admins can review any submission:
- Go to the qualification -> Submissions tab
- Click a submission to see criterion results
- Approve or Reject the submission
- Optionally add review notes
- Override individual criterion scores if needed
After approval, if auto-advance is enabled and all requirements now have passing submissions, the member automatically advances from Enrolled to Submitted.
Permissions
| Action | Required Scope |
|---|---|
| View qualifications and submissions | members:read |
| Create/edit qualifications, requirements, criteria | members:write |
| Enroll members, review submissions | members:write |
| Archive/delete qualifications | members:admin |
Tips
- Start with Formula criteria for clear-cut requirements (e.g., "assignment must complete successfully"), then add AI Evaluation criteria for nuanced evaluation.
- Use `messages.exists()` for tool call checks instead of relying solely on aggregate variables -- it gives you ordering context and argument-level inspection.
- Start simple. `assignment.outcome == "success"` catches most chat failures. Add more specific criteria only when you need them.
- Use `(?i)` in `matches()` for case-insensitive word checks in CEL expressions.
- Test your CEL expressions against sample data before deploying them to a live qualification.
- Use auto-advance to reduce manual work -- members progress automatically when requirements pass.
- Provisional status is your safety net -- let members handle real interactions under monitoring before granting full qualification.
- One qualification per workflow keeps things simple -- if a workflow needs different skill levels, use different requirements within one qualification.
- Review AI Evaluation criteria results carefully -- LLM evaluation is helpful but not infallible; the human review step exists for a reason.
Related
- Live Operator Mode — Set up human operators that qualifications gate
- CEL Expressions — Write formula criteria for automated evaluation
- Workflows — Build the workflows that qualifications apply to
Related Resources
Members
Add, invite, and manage the humans in your workspace — team and customers alike.
Importing Members
Import members in bulk via CSV upload or the API — format your file, handle duplicates, and troubleshoot errors.
Roles & Permissions
How member roles and permission scopes control access to workspace features and data.