Speechbase: Universal Text-to-Speech Gateway & Voice Management

Google Gemini text-to-speech through the Speechbase gateway — Gemini 3.1 Flash and 2.5 Flash/Pro preview models.


Prefix	`google`
Default model	`gemini-2.5-flash-preview-tts`
Provider key	Connect under Provider Keys

Route to Google by prefixing the model name with `google/`. The newest and highest-quality model is `gemini-3.1-flash-tts-preview`; the default, `gemini-2.5-flash-preview-tts`, favours latency.
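As a small sketch of the routing rule, the shell helper below prepends the `google/` prefix to a bare Gemini model name and falls back to the default model from the table above when no name is given. The function name `google_model` is hypothetical and is not part of Speechbase.

```shell
#!/bin/sh
# Hypothetical helper: build the routed model ID for the gateway.
# Falls back to the documented default when no model name is passed.
google_model() {
  printf 'google/%s\n' "${1:-gemini-2.5-flash-preview-tts}"
}

google_model                               # routed ID for the default model
google_model gemini-3.1-flash-tts-preview  # routed ID for the newest model
```

The resulting string is what goes into the `model` field of the request body.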

Models

Model	Streaming	Audio tags	Timestamps	Languages
`gemini-3.1-flash-tts-preview`	Yes	Yes	Gateway-generated	78
`gemini-2.5-flash-preview-tts`	Yes	—	Gateway-generated	24
`gemini-2.5-pro-preview-tts`	Yes	—	Gateway-generated	24

Usage

curl -X POST https://api.speechbase.ai/v1/audio/speech \
  -H "Authorization: Bearer $SPEECHBASE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "inline",
    "model": "google/gemini-2.5-flash-preview-tts",
    "voice": "Kore",
    "text": "Hello from Speechbase!",
    "output": "mp3"
  }' --output hello.mp3

Built-in voices include Kore, Puck, Zephyr, Charon, Aoede, Fenrir, and others — see the Gemini TTS docs for the full list.

Output format

Gemini natively returns raw PCM. Set the `output` field to "wav", "mp3", or "pcm" to have the gateway transcode the audio for you; omit it to pass the raw PCM through unchanged. See Output formats.
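The difference between the two cases can be sketched with a pair of shell helpers: one builds the request body and includes `output` only when a format is passed, the other sends it with the same curl call as in Usage. The names `speech_body` and `synthesize` are hypothetical, and the sketch assumes the other fields from the Usage example are still required when `output` is omitted.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON request body.
# The "output" field is added only when a format is passed.
# Note: the text is inserted as-is, so it must not contain
# double quotes or backslashes.
speech_body() {
  body_text=$1
  body_format=${2:-}
  printf '{"mode":"inline","model":"google/gemini-2.5-flash-preview-tts","voice":"Kore","text":"%s"' "$body_text"
  if [ -n "$body_format" ]; then
    printf ',"output":"%s"' "$body_format"
  fi
  printf '}\n'
}

# Hypothetical helper: synthesize TEXT OUTFILE [FORMAT]
synthesize() {
  curl -sS --fail -X POST https://api.speechbase.ai/v1/audio/speech \
    -H "Authorization: Bearer $SPEECHBASE_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(speech_body "$1" "${3:-}")" --output "$2"
}

# Example calls (need network access and a valid API key):
#   synthesize "Hello from Speechbase!" hello.wav wav   # gateway transcodes to WAV
#   synthesize "Hello from Speechbase!" hello.pcm       # raw PCM passed through
```

A raw PCM file carries no header, so a player needs to be told the sample rate and sample format explicitly; this page does not state them, so check the Output formats page before decoding.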

Provider options

Anything in `providerOptions` is forwarded to the Gemini API unchanged (for example, `temperature`).
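A request that passes a provider option might look like the sketch below. It assumes `providerOptions` is a top-level object in the request body, which this page does not spell out, and the `temperature` value of 0.7 is purely illustrative. The `send_speech` function name is hypothetical.

```shell
#!/bin/sh
# Request body with a providerOptions object.
# Assumption: providerOptions sits at the top level of the body.
body='{
  "mode": "inline",
  "model": "google/gemini-2.5-flash-preview-tts",
  "voice": "Kore",
  "text": "Hello from Speechbase!",
  "output": "mp3",
  "providerOptions": { "temperature": 0.7 }
}'

# Hypothetical helper: send_speech OUTFILE
send_speech() {
  curl -sS --fail -X POST https://api.speechbase.ai/v1/audio/speech \
    -H "Authorization: Bearer $SPEECHBASE_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$body" --output "$1"
}

# Example call (needs network access and a valid API key):
#   send_speech hello.mp3
```

Because the gateway forwards these values unchanged, which keys are accepted depends on the Gemini API rather than on Speechbase.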
