Voices
Reusable voice templates that pin a provider, model, and provider voice ID — the unit Speechbase speaks with.
A voice is the smallest unit Speechbase speaks with. It bundles three things that always need to travel together:
- the provider (e.g. elevenlabs),
- the model (e.g. eleven_v3),
- the provider's voice ID (e.g. EXAVITQu4vr4xnSDxMaL for ElevenLabs's "Sarah"),
…plus optional metadata to make it usable across your org.
Why voices exist
Without voices, every synthesis request would have to repeat the full
provider/model/voice_id triple. With voices, you save the triple once,
give it a name, and reference it by ID afterwards:
{
"mode": "voice",
"voiceId": "01940f8a-2dc1-7000-9b6c-fc6dd8a0a4d2",
"text": "Hello!"
}

That request will resolve to whichever provider/model/voice the voice points at. If you re-skin the voice (move "Sarah" from ElevenLabs to Cartesia, say), no caller has to change.
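The indirection can be illustrated with a toy model. This is only a sketch of the idea, not Speechbase's implementation: the `VoiceTarget` class, the in-memory `voices` dictionary, and the `resolve` function are hypothetical names, and the Cartesia model and voice ID are placeholders rather than real identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceTarget:
    """The provider/model/voice_id triple a saved voice points at."""
    provider: str
    model: str
    voice_id: str


# Hypothetical in-memory stand-in for the saved voices of one org.
voices = {
    "01940f8a-2dc1-7000-9b6c-fc6dd8a0a4d2": VoiceTarget(
        "elevenlabs", "eleven_v3", "EXAVITQu4vr4xnSDxMaL"
    ),
}


def resolve(request: dict) -> VoiceTarget:
    """Look up the triple for a voice-mode request."""
    if request.get("mode") != "voice":
        raise ValueError("this sketch only handles mode='voice'")
    return voices[request["voiceId"]]


# The caller's request never mentions a provider.
request = {
    "mode": "voice",
    "voiceId": "01940f8a-2dc1-7000-9b6c-fc6dd8a0a4d2",
    "text": "Hello!",
}

before = resolve(request)

# Re-skin the voice: same Speechbase ID, different triple (placeholder values).
voices[request["voiceId"]] = VoiceTarget(
    "cartesia", "example-model", "example-voice-id"
)

after = resolve(request)
print(before.provider, "->", after.provider)
```

The request dictionary is identical before and after the re-skin; only the row behind the voice ID changed.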
Anatomy of a voice
| Field | Notes |
|---|---|
| id | Speechbase-issued UUID. Stable forever. |
| org_id | Owning organisation. Voices never cross orgs. |
| title | Human-readable name shown in the dashboard. |
| provider | Provider ID: one of the entries in GET /v1/audio/providers. |
| model | Model ID as the provider expects it (no prefix here; the row already has provider). |
| voice_id | The provider's voice identifier. |
| tags | Free-form array for filtering ("podcast", "narrator", "uk-english", …). |
| gender | Optional: male, female, or null. |
| description | Optional notes for your team. |
| provider_options | JSON object passed verbatim to the provider at synthesis time. Use it for things like stability and similarity_boost on ElevenLabs. |
| created_at / updated_at | ISO-8601 timestamps. |
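For client code, the fields above could be mirrored in a small typed record. The following is a minimal sketch, assuming the field names match the table exactly; the `Voice` class itself, its defaults, and the sample values (org ID, tags, option values, timestamps) are illustrative and not part of any official SDK.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Voice:
    """Client-side mirror of a voice row (hypothetical helper type)."""
    id: str
    org_id: str
    title: str
    provider: str
    model: str
    voice_id: str
    tags: list[str] = field(default_factory=list)
    gender: Optional[str] = None
    description: Optional[str] = None
    provider_options: dict[str, Any] = field(default_factory=dict)
    created_at: str = ""  # ISO-8601 timestamp
    updated_at: str = ""  # ISO-8601 timestamp

    def __post_init__(self) -> None:
        # The table allows only male, female, or null for gender.
        if self.gender not in ("male", "female", None):
            raise ValueError(f"unexpected gender: {self.gender!r}")


sarah = Voice(
    id="01940f8a-2dc1-7000-9b6c-fc6dd8a0a4d2",
    org_id="org-placeholder",  # placeholder, not a real org ID
    title="Sarah",
    provider="elevenlabs",
    model="eleven_v3",
    voice_id="EXAVITQu4vr4xnSDxMaL",
    tags=["narrator"],
    gender="female",
    # Illustrative values; passed verbatim to the provider at synthesis time.
    provider_options={"stability": 0.5, "similarity_boost": 0.75},
    created_at="2025-01-01T00:00:00Z",
    updated_at="2025-01-01T00:00:00Z",
)
```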
Creating a voice
There are two production-safe ways to fill in a voice:
- From a catalog voice. Browse Voices in the dashboard; the trending panel lists voices from each provider's public catalogue. "Save" copies the provider/model/voice_id into a new voice on your org.
- From an existing provider voice. If you already created a custom voice in a provider dashboard, import its provider-native voice_id into Speechbase and add the metadata your team needs.
You can also POST /v1/voices directly — useful for syncing a voice library
into Speechbase from automation.
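A sync script might build the POST /v1/voices call roughly as follows. This is a sketch under several assumptions that the text above does not confirm: the base URL is a placeholder, bearer-token authentication is assumed, and the body is assumed to use the same snake_case field names as the anatomy table. The request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.speechbase.example"  # placeholder host (assumption)
API_KEY = "sk-placeholder"                   # placeholder credential (assumption)


def build_create_voice_request(voice: dict) -> urllib.request.Request:
    """Build, but do not send, a POST /v1/voices request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/voices",
        data=json.dumps(voice).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",  # auth scheme is assumed
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_voice_request(
    {
        # Body field names are assumed to mirror the anatomy table.
        "title": "Sarah",
        "provider": "elevenlabs",
        "model": "eleven_v3",
        "voice_id": "EXAVITQu4vr4xnSDxMaL",
        "tags": ["narrator"],
    }
)

# To actually send it, pass the request to urllib.request.urlopen(req).
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the sync job easy to dry-run against a voice library before it touches the org.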
Voice cloning
The dashboard includes a clone creation flow for collecting reference audio and
metadata. A cloned voice is only usable for synthesis after it has a
provider-native voice_id bound to the saved voice row. Until then, calls using
that voice fail with voice_incomplete.
For the practical dashboard workflow, see Voice management.
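Callers that want to avoid a failed synthesis call can check a cloned voice before using it. The sketch below assumes that an unbound clone is returned with a null or empty voice_id; that representation, the `VoiceIncompleteError` class, and the `require_bound_voice` helper are all hypothetical.

```python
class VoiceIncompleteError(Exception):
    """Raised locally when a voice has no provider-native voice_id yet."""


def require_bound_voice(voice: dict) -> str:
    """Return the provider voice_id, or raise if the voice is not yet bound.

    Assumption: an unbound clone carries a null or empty voice_id.
    """
    voice_id = voice.get("voice_id")
    if not voice_id:
        raise VoiceIncompleteError(
            f"voice {voice.get('id')!r} has no provider voice_id; "
            "synthesis would fail with voice_incomplete"
        )
    return voice_id


# A clone whose reference audio was collected but which is not bound yet.
pending_clone = {"id": "clone-placeholder", "title": "My clone", "voice_id": None}

try:
    require_bound_voice(pending_clone)
    status = "ready"
except VoiceIncompleteError:
    status = "incomplete"

print(status)
```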
Pronunciations
Pronunciation dictionaries are workspace-level and request-level controls, not properties of a voice. The org default dictionary applies automatically, and callers can pass additional dictionaries or inline rules on a specific synthesis request.
Deleting a voice
DELETE /v1/voices/{id} removes the row immediately.