Play Audio Action

Play a pre-recorded audio file on a phone call as part of an automated workflow.

Creating a Play Audio Action

A Play Audio action plays a pre-recorded audio file from your workspace Files during an automated voice flow. Typical uses are a regulatory disclosure that has to be delivered verbatim, a hold message, or a voicemail recording dropped on an answering machine.

How it fits together: "Play Audio" is an Action type. You attach it to a workflow task (or a call event) so that when the trigger fires, the assistant plays the audio file instead of speaking text. Pair it with Allow interrupt = off for required disclosures that must be heard end-to-end.

Prerequisites

  • A pre-recorded audio file uploaded to your workspace Files. Supported formats: WAV, MP3, M4A.
  • A workflow with the task that should play the audio (see Workflows), or a phone-call trigger you want to attach the action to.
  • A voice-enabled assistant configured for the workflow.

Steps

1. Upload the audio file

Go to Files in your workspace and upload the audio file the action will play. WAV at 24 kHz mono produces the highest fidelity on a call leg; MP3 / M4A also work.
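
Before uploading, it can help to confirm that a WAV file matches the recommended format. The following is a minimal sketch using Python's standard wave module. The function names are illustrative and not part of the product, and the check only works for uncompressed PCM WAV files (not MP3 or M4A).

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def is_24k_mono(path):
    """True when the file is 24 kHz mono, the format recommended above."""
    rate, channels, _ = describe_wav(path)
    return rate == 24000 and channels == 1
```

Calling is_24k_mono("disclosure-2026.wav") on a local copy of the recording tells you whether it needs resampling or a stereo-to-mono downmix before upload.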

2. Open the Actions page

Go to Actions in your workspace and click New Action.

3. Configure the trigger

Choose the trigger that should play the clip — typically:

  • Task Entered for a clip that plays when the call reaches a specific workflow task (e.g. an opening disclosure on the first task).
  • Call Started for a clip that plays the moment a call connects (e.g. a voicemail-drop recording).

4. Select "Play Audio" as the action type

Pick Play Audio from the action-type dropdown. The Play Audio form appears.

5. Pick the audio clip

Use the file picker to select the audio file you uploaded. The picker lists files whose MIME type begins with audio/ (plus .m4a voice memos). Click the Preview button next to the picker to listen to the clip in the browser before saving.
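
If an uploaded file does not show up in the picker, the filter rule is the likely reason. The sketch below approximates that rule as described above. It is an illustration only: the function name is hypothetical, and the product's actual matching logic is not documented here.

```python
def is_listed_in_picker(filename, mime_type):
    """Approximate the picker rule: audio/* MIME types, plus .m4a by extension."""
    if mime_type and mime_type.lower().startswith("audio/"):
        return True
    # Assumption: .m4a voice memos may arrive with a non-audio MIME type,
    # so they are matched by file extension as a fallback.
    return filename.lower().endswith(".m4a")
```

Under this reading, a file named memo.m4a is listed even when its MIME type is reported as video/mp4, while a PDF is not.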

6. Set the volume

The Volume slider controls server-side gain that's applied as the clip is streamed to the call leg.

  • 80% is the default and corresponds to unity gain: no adjustment, so the caller hears the file as recorded.
  • Higher values boost the level; lower values attenuate.
  • The browser preview tracks the slider for attenuation but caps at unity (it can't boost above the source's natural level). Above-unity gain only takes effect on the live call.

If your recordings are already mastered loud, leave the slider at 80%. If a recording was captured at a low level, raise the slider above 80%. Because the in-browser preview caps at unity, it will not get any louder past that point; the extra boost is only audible on the live call.
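
The relationship between the slider, the live call, and the preview can be sketched as follows. This assumes the gain is linear in the slider value with 80% mapped to 1.0. The actual gain curve is not documented here (it could be logarithmic), so treat the numbers as illustrative.

```python
UNITY_PERCENT = 80  # slider position documented as "no adjustment"


def live_gain(slider_percent):
    """Gain multiplier on the live call (assumed linear in the slider value)."""
    return slider_percent / UNITY_PERCENT


def preview_gain(slider_percent):
    """Browser preview follows attenuation but is capped at unity (1.0)."""
    return min(live_gain(slider_percent), 1.0)
```

Under this assumption, a slider at 40% halves the level in both places, while a slider at 100% gives 1.25x on the live call and stays at 1.0 in the preview.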

7. Allow interrupt

The Allow interrupt switch controls whether the caller can interrupt the clip by speaking.

  • On (default): speaking interrupts the clip and hands the turn back to the assistant. Right for greetings, hold music, and most narrative content.
  • Off: the clip plays end-to-end. Server-side VAD (voice activity detection) ignores barge-in, and any final transcript that arrives mid-clip is deferred until playback completes, so something the caller said before the clip began cannot interrupt the disclosure. Right for regulatory disclosures, recorded notices that must be delivered verbatim, and any clip that has to be heard in full.
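
The two modes can be pictured with a toy model. This is a simplified sketch, not the product's implementation: the class and method names are hypothetical, and it treats an arriving final transcript as the barge-in signal, whereas a real system would typically react to voice activity earlier.

```python
class ClipPlayback:
    """Toy model of how caller speech is handled while a clip is playing."""

    def __init__(self, allow_interrupt):
        self.allow_interrupt = allow_interrupt
        self.playing = True
        self.interrupted = False
        self.deferred = []   # transcripts held back until the clip ends
        self.delivered = []  # transcripts handed to the assistant

    def on_final_transcript(self, text):
        if not self.playing:
            self.delivered.append(text)
        elif self.allow_interrupt:
            # Barge-in: stop the clip and give the turn to the assistant.
            self.playing = False
            self.interrupted = True
            self.delivered.append(text)
        else:
            # Must-play clip: hold the transcript until playback completes.
            self.deferred.append(text)

    def on_clip_finished(self):
        self.playing = False
        self.delivered.extend(self.deferred)
        self.deferred.clear()
```

With allow_interrupt=False, a transcript that arrives mid-clip sits in deferred and only reaches the assistant after on_clip_finished() runs. With allow_interrupt=True, the same transcript stops playback immediately.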

8. Hang up after

The Hang up after switch ends the call immediately when the clip finishes — no further AI turns, no follow-up questions.

Pair this with Allow interrupt = off for voicemail drops: the assistant calls, leaves the recorded message even if the answering machine starts beeping, and then disconnects without the AI improvising.

9. Save the action

Click Save. The action is now linked to its trigger and fires whenever the trigger event happens.

Recipes

Mandatory regulatory disclosure on call connect

Setting          Value
Trigger          Call Started
Audio clip       disclosure-2026.wav (your recording)
Volume           80%
Allow interrupt  Off
Hang up after    Off

Plays the disclosure verbatim the moment the call connects. The caller cannot talk over it; mid-clip transcripts are queued and delivered to the assistant once the disclosure finishes.

Voicemail drop with hang-up

Setting          Value
Trigger          Task Entered (a "leave voicemail" task)
Audio clip       voicemail-promo.mp3
Volume           80–95% (recordings tend to need a slight boost over voicemail compression)
Allow interrupt  Off
Hang up after    On

Plays the recorded message and disconnects. The AI never improvises — what the answering machine captures is exactly what was uploaded.

Hold music while a tool runs

Setting          Value
Trigger          Task Entered (a "lookup pending" task)
Audio clip       short-hold.mp3
Volume           70% (slightly attenuated below speech)
Allow interrupt  On
Hang up after    Off

Lets the caller barge in to ask a follow-up question. The assistant resumes the conversation as soon as the lookup completes.
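
For teams that keep a written record of their action settings, the three recipes can be captured as plain data. The sketch below uses a hypothetical PlayAudioAction dataclass. It mirrors the form fields in this guide and is not an API the product exposes. The 90% volume for the voicemail drop is one arbitrary value inside the 80–95% range given above.

```python
from dataclasses import dataclass


@dataclass
class PlayAudioAction:
    """Hypothetical record of the Play Audio form fields (not a product API)."""
    trigger: str                  # "Call Started" or "Task Entered"
    audio_file: str               # name of a file uploaded to workspace Files
    hang_up_after: bool           # end the call when the clip finishes
    volume_percent: int = 80      # 80% is unity gain
    allow_interrupt: bool = True  # On is the documented default

    def must_play_in_full(self):
        """True when the caller cannot barge in over the clip."""
        return not self.allow_interrupt


disclosure = PlayAudioAction(
    trigger="Call Started", audio_file="disclosure-2026.wav",
    hang_up_after=False, allow_interrupt=False,
)
voicemail_drop = PlayAudioAction(
    trigger="Task Entered", audio_file="voicemail-promo.mp3",
    hang_up_after=True, volume_percent=90, allow_interrupt=False,
)
hold_music = PlayAudioAction(
    trigger="Task Entered", audio_file="short-hold.mp3",
    hang_up_after=False, volume_percent=70,
)
```

Keeping the recipes in this form makes it easy to review, for example, that every clip meant to be heard in full has allow_interrupt set to False.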

Notes & limitations

  • Audio clips are workspace files, not member-specific recordings. If you need a per-member dynamic message (e.g. "Hi {{member.first_name}}…"), use Assistant Message with text instead; the assistant synthesizes the message with TTS on the fly.
  • The clip's persisted message appears in the chat transcript as an audio bubble with a play button, so reviewers can hear exactly what was delivered. The bubble shows "plays end-to-end" and "then hangs up" annotations when the corresponding switches were set.
  • Replay in the web voice console plays the persisted clip (not a fresh TTS), so what you hear during replay is what the caller heard.