Cartesia

Cartesia Sonic text-to-speech through the Speechbase gateway, with audio tags, voice cloning, and native timestamps on sonic-3.

Prefix: cartesia
Default model: sonic-3
Provider key: Connect under Provider Keys

Route to Cartesia by prefixing the model name with cartesia/, for example cartesia/sonic-3. sonic-3 is the flagship: fast, multilingual, and the model that supports audio tags and voice cloning.

Models

Model    Streaming  Audio tags  Voice cloning  Timestamps  Languages
sonic-3  Yes        Yes         Yes            Native      42
sonic-2  Yes        No          No             Native      1
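
The table can be encoded as a small lookup so a client can check a feature before sending a request. This is only a sketch: `CARTESIA_MODELS`, `split_model`, and `supports` are hypothetical helper names rather than part of any Speechbase SDK, and the blank sonic-2 cells are read as "not supported".

```python
# Capability data transcribed from the table above; blank cells are
# treated as "not supported" (an assumption of this sketch).
CARTESIA_MODELS = {
    "sonic-3": {"streaming": True, "audio_tags": True, "voice_cloning": True,
                "timestamps": "native", "languages": 42},
    "sonic-2": {"streaming": True, "audio_tags": False, "voice_cloning": False,
                "timestamps": "native", "languages": 1},
}


def split_model(routed: str) -> tuple[str, str]:
    """Split 'cartesia/sonic-3' into ('cartesia', 'sonic-3')."""
    provider, sep, model = routed.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected '<provider>/<model>', got {routed!r}")
    return provider, model


def supports(routed: str, feature: str) -> bool:
    """Return True if the routed Cartesia model lists the feature as supported."""
    provider, model = split_model(routed)
    if provider != "cartesia":
        raise ValueError(f"this table only covers cartesia models, got {provider!r}")
    return bool(CARTESIA_MODELS[model].get(feature, False))


print(supports("cartesia/sonic-3", "voice_cloning"))  # True
print(supports("cartesia/sonic-2", "voice_cloning"))  # False
```

A check like this lets a client fail early, for example before attempting a cloned-voice request against sonic-2.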

Usage

curl -X POST https://api.speechbase.ai/v1/audio/speech \
  -H "Authorization: Bearer $SPEECHBASE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "inline",
    "model": "cartesia/sonic-3",
    "voice": "a0e99841-438c-4a64-b679-ae501e7d6091",
    "text": "Hello from Speechbase!",
    "output": "mp3"
  }' --output hello.mp3

voice is a Cartesia voice UUID.
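
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl example field for field; `build_speech_request` and `synthesize_to_file` are hypothetical helper names, and the sketch assumes the response body is the raw audio, as the curl `--output` flag suggests.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example above.
SPEECH_URL = "https://api.speechbase.ai/v1/audio/speech"


def build_speech_request(api_key: str, text: str, voice: str,
                         model: str = "cartesia/sonic-3") -> urllib.request.Request:
    """Build (but do not send) the same inline request as the curl example."""
    body = {
        "mode": "inline",
        "model": model,
        "voice": voice,  # a Cartesia voice UUID
        "text": text,
        "output": "mp3",
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(request: urllib.request.Request, path: str) -> None:
    """Send the request and write the response bytes to path.

    Assumes the response body is the audio itself.
    """
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(path, "wb") as fh:
        fh.write(audio)


# Only contacts the API when run as a script with a key in the environment.
if __name__ == "__main__" and os.environ.get("SPEECHBASE_API_KEY"):
    req = build_speech_request(
        os.environ["SPEECHBASE_API_KEY"],
        "Hello from Speechbase!",
        "a0e99841-438c-4a64-b679-ae501e7d6091",
    )
    synthesize_to_file(req, "hello.mp3")
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.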

Voice cloning

sonic-3 supports voice cloning. Clone a reference into a saved Voice, then address it by voiceId with mode: "voice". Inline requests (mode: "inline") take a plain voice string only.
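
As a rough illustration, the body of a saved-voice request might be built as follows. Only `mode: "voice"` and `voiceId` come from the text above. The sketch assumes `text` and `output` carry over from the inline example. `saved_voice_payload` is a hypothetical helper, and the Voice ID shown is a placeholder. The step that creates the saved Voice is not shown because its endpoint is not described here.

```python
import json
from typing import Optional


def saved_voice_payload(text: str, voice_id: str,
                        model: Optional[str] = None) -> dict:
    """Body for a request that addresses a saved (cloned) Voice by ID."""
    if not voice_id:
        raise ValueError("voice_id is required when mode is 'voice'")
    body = {"mode": "voice", "voiceId": voice_id, "text": text, "output": "mp3"}
    # Assumption: the page does not say whether "model" is still sent in
    # voice mode, so it is included only when the caller passes one.
    if model is not None:
        body["model"] = model
    return body


# "my-cloned-voice-id" is a placeholder, not a real Voice ID.
payload = saved_voice_payload("Hello in my own voice.", "my-cloned-voice-id")
print(json.dumps(payload, indent=2))
```

Note that the body carries `voiceId` rather than the `voice` string used by inline requests.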

Provider options

Anything in providerOptions is forwarded to the Cartesia API unchanged (for example speed or emotion controls).
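
A request carrying provider options might be assembled like this. The sketch assumes `providerOptions` is a top-level object in the request body, which this page does not spell out. `with_provider_options` is a hypothetical helper, and the `speed` value is illustrative only; the option names and values that Cartesia accepts are defined by the Cartesia API.

```python
import json


def with_provider_options(payload: dict, **options) -> dict:
    """Return a copy of payload with options merged into providerOptions."""
    merged = dict(payload)
    merged["providerOptions"] = {**payload.get("providerOptions", {}), **options}
    return merged


base = {
    "mode": "inline",
    "model": "cartesia/sonic-3",
    "voice": "a0e99841-438c-4a64-b679-ae501e7d6091",
    "text": "Hello from Speechbase!",
    "output": "mp3",
}

# Assumption: "speed" and its value are illustrative; check Cartesia's API
# reference for the option names and values it actually accepts.
body = with_provider_options(base, speed=1.2)
print(json.dumps(body, indent=2))
```

Because the options are forwarded unchanged, any validation of their names or values would happen on the Cartesia side rather than in the gateway.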
