Voices

Reusable voice templates that pin a provider, model, and provider voice ID — the unit Speechbase speaks with.

A voice is the smallest unit Speechbase speaks with. It bundles three things that always need to travel together:

the provider (e.g. elevenlabs),
the model (e.g. eleven_v3),
the provider's voice ID (e.g. EXAVITQu4vr4xnSDxMaL for ElevenLabs's "Sarah"),

…plus optional metadata to make it usable across your org.

Why voices exist

Without voices, every synthesis request would have to repeat the full provider/model/voice_id triple. With voices, you save the triple once, give it a name, and reference it by ID afterwards:

{
  "mode": "voice",
  "voiceId": "01940f8a-2dc1-7000-9b6c-fc6dd8a0a4d2",
  "text": "Hello!"
}

That request resolves to whichever provider, model, and provider voice ID the voice points at. If you re-skin the voice (say, move "Sarah" from ElevenLabs to Cartesia), no caller has to change.
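The indirection described above can be sketched as a simple lookup. This is an illustration only, not Speechbase's implementation: the in-memory `voices` store, the `VoiceTarget` class, and the `resolve` function are hypothetical names, and the Cartesia model and voice ID are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceTarget:
    """The provider/model/voice_id triple a saved voice points at."""
    provider: str
    model: str
    voice_id: str


# Hypothetical stand-in for the stored voices, keyed by Speechbase voice ID.
voices = {
    "01940f8a-2dc1-7000-9b6c-fc6dd8a0a4d2": VoiceTarget(
        provider="elevenlabs", model="eleven_v3", voice_id="EXAVITQu4vr4xnSDxMaL"
    ),
}


def resolve(request: dict, store: dict) -> VoiceTarget:
    """Look up the triple for a mode="voice" request (illustrative only)."""
    if request.get("mode") != "voice":
        raise ValueError("this sketch only handles mode='voice'")
    return store[request["voiceId"]]


request = {
    "mode": "voice",
    "voiceId": "01940f8a-2dc1-7000-9b6c-fc6dd8a0a4d2",
    "text": "Hello!",
}

before = resolve(request, voices)

# Re-skin the voice: same Speechbase ID, different provider behind it.
# The Cartesia model and voice ID below are placeholders, not real values.
voices[request["voiceId"]] = VoiceTarget(
    provider="cartesia", model="placeholder-model", voice_id="placeholder-voice-id"
)

after = resolve(request, voices)  # the request dict itself never changed
```

The caller's request is identical before and after the re-skin; only the stored triple changes.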

Anatomy of a voice

Field	Notes
`id`	Speechbase-issued UUID. Stable forever.
`org_id`	Owning organisation. Voices never cross orgs.
`title`	Human-readable name shown in the dashboard.
`provider`	Provider ID — one of the entries in `GET /v1/audio/providers`.
`model`	Model ID as the provider expects it (no prefix here; the row already has `provider`).
`voice_id`	The provider's voice identifier.
`tags`	Free-form array for filtering ("podcast", "narrator", "uk-english", …).
`gender`	Optional — `male`, `female`, or `null`.
`description`	Optional notes for your team.
`provider_options`	JSON object passed verbatim to the provider at synthesis time. Use it for things like `stability` and `similarity_boost` on ElevenLabs.
`created_at` / `updated_at`	ISO-8601 timestamps.
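For client code, the fields above could be mirrored in a small typed record. The sketch below is one possible shape, not an official SDK type: the `Voice` class name, the Python types, the defaults, the `org_example` value, and the choice to allow an empty `voice_id` (for a clone that has not been bound yet) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

ALLOWED_GENDERS = {"male", "female", None}


@dataclass
class Voice:
    id: str                      # Speechbase-issued UUID
    org_id: str                  # owning organisation
    title: str                   # human-readable name
    provider: str                # e.g. "elevenlabs"
    model: str                   # model ID as the provider expects it
    # Assumption: empty until a provider-native ID is bound to the row.
    voice_id: Optional[str] = None
    tags: list[str] = field(default_factory=list)
    gender: Optional[str] = None
    description: Optional[str] = None
    # Passed verbatim to the provider at synthesis time.
    provider_options: dict[str, Any] = field(default_factory=dict)
    created_at: Optional[str] = None  # ISO-8601 timestamp
    updated_at: Optional[str] = None  # ISO-8601 timestamp

    def __post_init__(self) -> None:
        if self.gender not in ALLOWED_GENDERS:
            raise ValueError(f"gender must be 'male', 'female', or None, got {self.gender!r}")


sarah = Voice(
    id="01940f8a-2dc1-7000-9b6c-fc6dd8a0a4d2",
    org_id="org_example",  # placeholder organisation ID
    title="Sarah",
    provider="elevenlabs",
    model="eleven_v3",
    voice_id="EXAVITQu4vr4xnSDxMaL",
    tags=["narrator"],
    gender="female",
    provider_options={"stability": 0.5, "similarity_boost": 0.75},
)
```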

Creating a voice

There are two production-safe ways to fill in a voice:

From a catalogue voice. Browse Voices in the dashboard; the trending panel lists voices from each provider's public catalogue. "Save" copies the provider, model, and voice_id into a new voice in your org.
From an existing provider voice. If you already created a custom voice in a provider dashboard, import its provider-native voice_id into Speechbase and add the metadata your team needs.

You can also POST /v1/voices directly — useful for syncing a voice library into Speechbase from automation.
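A sync script might build that request along the following lines. This is a sketch under several assumptions: the host `api.speechbase.example` is a placeholder, Bearer-token authentication is assumed, and the body field names are assumed to mirror the fields listed under "Anatomy of a voice". Check the API reference for the actual request schema before relying on it.

```python
import json
import urllib.request

# Placeholder host; substitute the real API base URL.
BASE_URL = "https://api.speechbase.example"


def build_create_voice_request(api_key: str, voice: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/voices request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/voices",
        data=json.dumps(voice).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API accepts a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
    )


# Assumed body shape, based on the voice fields described on this page.
new_voice = {
    "title": "Sarah",
    "provider": "elevenlabs",
    "model": "eleven_v3",
    "voice_id": "EXAVITQu4vr4xnSDxMaL",
    "tags": ["narrator"],
}

req = build_create_voice_request("sk_example", new_voice)
# To actually send it: urllib.request.urlopen(req)
```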

Voice cloning

The dashboard includes a clone creation flow for collecting reference audio and metadata. A cloned voice is only usable for synthesis after it has a provider-native voice_id bound to the saved voice row. Until then, calls using that voice fail with voice_incomplete.
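A caller can avoid `voice_incomplete` failures by checking for a bound `voice_id` before synthesizing. The sketch below assumes an unbound clone is returned with a missing or empty `voice_id` field; that representation, the `is_synthesizable` helper, and the sample rows are illustrative assumptions rather than documented behaviour.

```python
def is_synthesizable(voice: dict) -> bool:
    """True when the voice row has a provider-native voice_id bound.

    Assumption: an unbound clone has a missing, null, or empty voice_id.
    """
    return bool(voice.get("voice_id"))


# Illustrative voice rows; "Pending clone" has no provider voice bound yet.
voices = [
    {"title": "Sarah", "provider": "elevenlabs", "voice_id": "EXAVITQu4vr4xnSDxMaL"},
    {"title": "Pending clone", "provider": "elevenlabs", "voice_id": None},
]

ready = [v["title"] for v in voices if is_synthesizable(v)]
pending = [v["title"] for v in voices if not is_synthesizable(v)]
```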

For the practical dashboard workflow, see Voice management.

Pronunciations

Pronunciation dictionaries are workspace-level and request-level controls. The org default dictionary applies automatically, and callers can pass additional dictionaries or inline rules on a specific synthesis request.

Deleting a voice

DELETE /v1/voices/{id} removes the row immediately.
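A minimal sketch of building that call from automation follows. The host is a placeholder and Bearer-token authentication is an assumption; only the method and path come from this page.

```python
import urllib.parse
import urllib.request

# Placeholder host; substitute the real API base URL.
BASE_URL = "https://api.speechbase.example"


def build_delete_voice_request(api_key: str, voice_id: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE /v1/voices/{id} request."""
    # Percent-encode the ID so it is safe to place in the URL path.
    path_id = urllib.parse.quote(voice_id, safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/voices/{path_id}",
        method="DELETE",
        # Assumption: the API accepts a Bearer token.
        headers={"Authorization": f"Bearer {api_key}"},
    )


req = build_delete_voice_request("sk_example", "01940f8a-2dc1-7000-9b6c-fc6dd8a0a4d2")
# To actually send it: urllib.request.urlopen(req)
```

Because the row is removed immediately, it is worth confirming the voice ID before sending the request.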
