Google

Google Gemini text-to-speech through the Speechbase gateway — Gemini 3.1 Flash and 2.5 Flash/Pro preview models.

Prefix: google
Default model: gemini-2.5-flash-preview-tts
Provider key: Connect under Provider Keys

Route to Google by prefixing the model name with google/ (for example, google/gemini-2.5-flash-preview-tts). The newest and highest-quality model is gemini-3.1-flash-tts-preview; the default, gemini-2.5-flash-preview-tts, favours latency.

Models

| Model | Streaming | Audio tags | Timestamps | Languages |
| --- | --- | --- | --- | --- |
| gemini-3.1-flash-tts-preview | Yes | Yes | STT fallback | 78 |
| gemini-2.5-flash-preview-tts | Yes | - | STT fallback | 24 |
| gemini-2.5-pro-preview-tts | Yes | - | STT fallback | 24 |

Usage

curl -X POST https://api.speechbase.ai/v1/audio/speech \
  -H "Authorization: Bearer $SPEECHBASE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "inline",
    "model": "google/gemini-2.5-flash-preview-tts",
    "voice": "Kore",
    "text": "Hello from Speechbase!",
    "output": "mp3"
  }' --output hello.mp3

Built-in voices include Kore, Puck, Zephyr, Charon, Aoede, Fenrir, and others — see the Gemini TTS docs for the full list.

Output format

Gemini returns raw PCM natively. Set the output field ("wav", "mp3", or "pcm") to have the gateway transcode for you; omit it to pass the raw PCM through. See Output formats.
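If you omit output and receive raw PCM, you will usually need to add a container before most players can open it. The following is a minimal sketch using Python's standard wave module. It assumes the PCM is 16-bit, mono, at 24 kHz; this page does not state those parameters, so check them against the Gemini TTS docs before relying on them. The helper name pcm_to_wav_bytes is hypothetical.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, sample_rate=24000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container and return the file bytes.

    Defaults are assumptions (24 kHz, mono, 16-bit), not values taken
    from this page; adjust them to match the audio you actually receive.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

Writing the returned bytes to a .wav file gives a playable file. In most cases, setting output to "wav" and letting the gateway transcode is simpler; this sketch is only useful when you deliberately want the raw stream.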

Provider options

Anything in providerOptions is forwarded to the Gemini API unchanged (for example temperature).
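As an illustration, the sketch below builds the same request as the Usage example and attaches a providerOptions object. It assumes providerOptions is a top-level field of the JSON body, which this page implies but does not show; the temperature value and the helper names build_speech_request and synthesize are hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the Usage example on this page.
SPEECH_URL = "https://api.speechbase.ai/v1/audio/speech"


def build_speech_request(text, voice="Kore", provider_options=None):
    """Build the JSON body; provider_options is attached unchanged when given."""
    body = {
        "mode": "inline",
        "model": "google/gemini-2.5-flash-preview-tts",
        "voice": voice,
        "text": text,
        "output": "mp3",
    }
    if provider_options:
        # Assumption: providerOptions sits at the top level of the body.
        body["providerOptions"] = dict(provider_options)
    return body


def synthesize(body, api_key):
    """POST the body to the gateway and return the raw response bytes."""
    request = urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

For example, build_speech_request("Hello from Speechbase!", provider_options={"temperature": 0.7}) produces a body that can be passed to synthesize together with the value of SPEECHBASE_API_KEY. Which option names Gemini accepts is defined by the Gemini API, not by the gateway.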
