Play Audio Action
Play a pre-recorded audio file on a phone call as part of an automated workflow.
Creating a Play Audio Action
A Play Audio action plays a pre-recorded audio file from your workspace Files during an automated voice flow. Typical uses are a regulatory disclosure that has to be delivered verbatim, a hold message, or a voicemail recording dropped on an answering machine.
How it fits together: "Play Audio" is an Action type. You attach it to a workflow task (or a call event) so that when the trigger fires, the assistant plays the audio file instead of speaking text. Pair it with Allow interrupt = off for required disclosures that must be heard end-to-end.
Prerequisites
- A pre-recorded audio file uploaded to your workspace Files. Supported formats: WAV, MP3, M4A.
- A workflow with the task that should play the audio (see Workflows), or a phone-call trigger you want to attach the action to.
- A voice-enabled assistant configured for the workflow.
Steps
1. Upload the audio file
Go to Files in your workspace and upload the audio file the action will play. WAV at 24 kHz mono produces the highest fidelity on a call leg; MP3 / M4A also work.
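If you want to confirm that a WAV file matches the recommended 24 kHz mono format before uploading it, a local check along these lines may help. This is a minimal sketch using Python's standard `wave` module, not part of the product; the function names are illustrative, and `wave` only reads uncompressed PCM WAV files.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels) read from a PCM WAV file's header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getnchannels()


def is_24khz_mono(path):
    """True when the file is 24 kHz mono, the format recommended for call legs."""
    return describe_wav(path) == (24000, 1)
```

For example, `is_24khz_mono("disclosure-2026.wav")` returns `False` for a 44.1 kHz stereo export, which tells you to resample before uploading if you want the recommended format.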
2. Open the Actions page
Go to Actions in your workspace and click New Action.
3. Configure the trigger
Choose the trigger that should play the clip — typically:
- Task Entered for a clip that plays when the call reaches a specific workflow task (e.g. an opening disclosure on the first task).
- Call Started for a clip that plays the moment a call connects (e.g. a voicemail-drop recording).
4. Select "Play Audio" as the action type
Pick Play Audio from the action-type dropdown. The Play Audio form appears.
5. Pick the audio clip
Use the file picker to select the audio file you uploaded. The picker lists files whose MIME type begins with audio/ (plus .m4a voice memos). Click the Preview button next to the picker to listen to the clip in the browser before saving.
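The picker's listing rule can be sketched as a small predicate. This is an illustration of the rule as described here, not the product's actual code; the function name and the case-insensitive matching are assumptions.

```python
def is_listed_in_picker(filename, mime_type):
    """Approximate the picker rule: audio/* MIME types, plus .m4a files.

    The .m4a fallback covers files whose MIME type does not start with
    "audio/" (an assumption about why the extension is special-cased).
    """
    has_audio_mime = mime_type.lower().startswith("audio/")
    is_m4a = filename.lower().endswith(".m4a")
    return has_audio_mime or is_m4a
```

Under this rule a file such as `voicemail-promo.mp3` with type `audio/mpeg` is listed, as is an `.m4a` voice memo regardless of its reported type, while a PDF is not.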
6. Set the volume
The Volume slider controls server-side gain that's applied as the clip is streamed to the call leg.
- 80% is the unity default — no adjustment, the call hears the file as-recorded.
- Higher values boost the level; lower values attenuate.
- The browser preview tracks the slider for attenuation but caps at unity (it can't boost above the source's natural level). Above-unity gain only takes effect on the live call.
If your recordings are already mastered loud, leave the slider at 80%. If a recording was captured at a low level, raise the slider above 80%. Because the browser preview caps at unity, a boosted clip will sound louder on the live call than it does in the preview.
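The relationship between the slider, the live call, and the browser preview can be sketched as follows. The linear mapping (gain = slider / 80) is an assumption made for illustration; the guide only states that 80% is unity, that higher values boost, and that the preview caps at unity. All names here are hypothetical.

```python
import math

UNITY_PERCENT = 80.0  # slider position documented as "no adjustment"


def call_gain(slider_percent):
    """Linear gain applied on the live call (assumed linear in the slider)."""
    return slider_percent / UNITY_PERCENT


def preview_gain(slider_percent):
    """Gain heard in the browser preview: follows attenuation, caps at unity."""
    return min(1.0, call_gain(slider_percent))


def gain_db(gain):
    """Express a linear gain in decibels; silence maps to negative infinity."""
    if gain <= 0:
        return float("-inf")
    return 20 * math.log10(gain)
```

With this model, a slider at 40% halves the level in both the preview and the call, while a slider at 95% leaves the preview at unity and boosts only the live call.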
7. Allow interrupt
The Allow interrupt switch controls whether the caller can interrupt the clip by speaking.
- On (default) — speaking interrupts the clip and hands the turn back to the assistant. Right for greetings, hold-music, and most narrative content.
- Off — the clip plays end-to-end. Server VAD ignores barge-in, and any final transcript that arrives mid-clip is deferred until playback completes (so something the caller said before the clip began doesn't interrupt the disclosure). Right for regulatory disclosures, recorded notices that must be delivered verbatim, and any clip that has to be heard in full.
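The two switch positions can be modelled with a small state holder. This is a simplified sketch of the behaviour described above, not the server's implementation; the class and method names are hypothetical.

```python
from collections import deque


class ClipPlayback:
    """Toy model of how final transcripts are handled while a clip plays."""

    def __init__(self, allow_interrupt):
        self.allow_interrupt = allow_interrupt
        self.playing = False
        self.deferred = deque()  # transcripts held back during playback
        self.delivered = []      # transcripts handed to the assistant

    def start(self):
        self.playing = True

    def on_final_transcript(self, text):
        if self.playing and not self.allow_interrupt:
            # Allow interrupt = off: hold the transcript until the clip ends.
            self.deferred.append(text)
            return
        if self.playing:
            # Allow interrupt = on: speech stops the clip (barge-in).
            self.playing = False
        self.delivered.append(text)

    def on_clip_finished(self):
        self.playing = False
        # Release held transcripts in the order they arrived.
        while self.deferred:
            self.delivered.append(self.deferred.popleft())
```

With `allow_interrupt=False`, anything the caller says mid-clip reaches the assistant only after `on_clip_finished()` runs; with `allow_interrupt=True`, the first transcript stops playback and is delivered immediately.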
8. Hang up after
The Hang up after switch ends the call immediately when the clip finishes — no further AI turns, no follow-up questions.
Pair this with Allow interrupt = off for voicemail drops: the assistant calls, leaves the recorded message even if the answering machine starts beeping, and then disconnects without the AI improvising.
9. Save the action
Click Save. The action is now linked to its trigger and fires whenever the trigger event happens.
Recipes
Mandatory regulatory disclosure on call connect
| Setting | Value |
|---|---|
| Trigger | Call Started |
| Audio clip | disclosure-2026.wav (your recording) |
| Volume | 80% |
| Allow interrupt | Off |
| Hang up after | Off |
Plays the disclosure verbatim the moment the call connects. The caller cannot talk over it; mid-clip transcripts are queued and delivered to the assistant once the disclosure finishes.
Voicemail drop with hang-up
| Setting | Value |
|---|---|
| Trigger | Task Entered (a "leave voicemail" task) |
| Audio clip | voicemail-promo.mp3 |
| Volume | 80–95% (recordings tend to need a slight boost to compensate for voicemail compression) |
| Allow interrupt | Off |
| Hang up after | On |
Plays the recorded message and disconnects. The AI never improvises — what the answering machine captures is exactly what was uploaded.
Hold music while a tool runs
| Setting | Value |
|---|---|
| Trigger | Task Entered (a "lookup pending" task) |
| Audio clip | short-hold.mp3 |
| Volume | 70% (slightly attenuated below speech) |
| Allow interrupt | On |
| Hang up after | Off |
Lets the caller barge in to ask a follow-up question. The assistant resumes the conversation as soon as the lookup completes.
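For teams that keep their action settings under review or in version control, the three recipes above can be written down as plain data. The sketch below is a hypothetical representation for documentation purposes only: the class, field names, and recipe keys are assumptions, not a product API or export format. The voicemail volume of 90 is just one value picked from the 80–95% range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlayAudioSettings:
    """Hypothetical record of one Play Audio action's form values."""
    trigger: str
    audio_clip: str
    volume_percent: int
    allow_interrupt: bool
    hang_up_after: bool


RECIPES = {
    "regulatory_disclosure": PlayAudioSettings(
        "Call Started", "disclosure-2026.wav", 80, False, False),
    "voicemail_drop": PlayAudioSettings(
        "Task Entered", "voicemail-promo.mp3", 90, False, True),
    "hold_music": PlayAudioSettings(
        "Task Entered", "short-hold.mp3", 70, True, False),
}


def annotations(settings):
    """List the behaviours a reviewer should expect from these switches."""
    notes = []
    if not settings.allow_interrupt:
        notes.append("plays end-to-end")
    if settings.hang_up_after:
        notes.append("then hangs up")
    return notes
```

Calling `annotations(RECIPES["voicemail_drop"])` yields both notes, which is a quick way to spot a recipe where a clip that must be heard in full was left interruptible.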
Notes & limitations
- Audio clips are workspace files, not member-specific recordings. If you need a per-member dynamic message (e.g. "Hi {{member.first_name}}…"), use Assistant Message with text instead; the assistant synthesizes the message with text-to-speech (TTS) on the fly.
- The clip's persisted message appears in the chat transcript as an audio bubble with a play button, so reviewers can hear exactly what was delivered. The bubble shows "plays end-to-end" / "then hangs up" annotations when those switches were set.
- Replay in the web voice console plays the persisted clip (not a fresh TTS), so what you hear during replay is what the caller heard.
Related Resources
Workflows
Build AI-powered conversation flows with tasks, abilities, and agents.
Actions
Automate your workspace with event-triggered actions, notifications, and webhooks.
Experiments (A/B Testing)
Run member-level A/B tests with weighted groups, CEL branching, and journey goal reporting.