
Audio Playground

Compare models and voices, stream audio, tune loudness, save voices, and copy integration code.

The Audio Playground is the fastest way to test providers before you commit to one in production. It uses the same Speechbase gateway as your app, so provider keys, enablement, logs, moderation, and saved voices behave the same way they do from code.

Open it from Audio Playground in the dashboard.

Compare models side by side

The playground starts with two model panels. You can add up to four panels and run the same input text against every selected provider/model/voice combination.

Each panel shows:

  • model and provider,
  • selected voice,
  • generation status,
  • audio playback,
  • generation time,
  • streaming time-to-first-byte when streaming is enabled.

Use this to compare quality, latency, cost tradeoffs, audio tag behavior, and voice fit before changing your application code.
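
The same fan-out can be reproduced from code. A minimal sketch, assuming a hypothetical `synthesize` helper in place of the real Speechbase client (the provider, model, and voice names here are illustrative, not real identifiers):

```typescript
// Sketch only: `synthesize` is a mock stand-in for a gateway call so the
// example is runnable; a real implementation would hit the Speechbase API.
type PanelConfig = { provider: string; model: string; voice: string };

async function synthesize(cfg: PanelConfig, text: string): Promise<{ bytes: number }> {
  return { bytes: text.length * 100 }; // fake audio payload size
}

// Run the same input text against every panel configuration, timing each.
async function comparePanels(text: string, panels: PanelConfig[]) {
  return Promise.all(
    panels.map(async (cfg) => {
      const start = Date.now();
      const audio = await synthesize(cfg, text);
      return { ...cfg, bytes: audio.bytes, generationMs: Date.now() - start };
    }),
  );
}

const results = await comparePanels("Same input for every panel.", [
  { provider: "providerA", model: "model-1", voice: "alloy" },
  { provider: "providerB", model: "model-2", voice: "verse" },
]);
```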

Choose voices

Each panel has a voice selector with two groups:

Group          What it contains
Your voices    Voices saved to your Speechbase workspace for the selected model.
Browse         Curated provider catalog voices you can audition without saving first.

When you pick a browsed voice, the playground marks it as unsaved and lets you save it to your voice library. Saved voices can then be referenced by stable Speechbase voice IDs in production requests.
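
The distinction matters when you move to code. A sketch of the two reference styles, with a hypothetical request-building helper (the shapes are illustrative, not the real Speechbase API):

```typescript
// Sketch only: saved voices travel as stable Speechbase voice IDs, while an
// unsaved catalog voice must name the provider voice directly.
type VoiceRef =
  | { kind: "saved"; voiceId: string }
  | { kind: "catalog"; provider: string; providerVoice: string };

function voiceForRequest(ref: VoiceRef): string {
  return ref.kind === "saved"
    ? ref.voiceId // stable across provider catalog changes
    : `${ref.provider}:${ref.providerVoice}`; // tied to the provider's naming
}
```

Saving a voice first is usually the better production choice, since the stable ID keeps working even if the provider renames or reorganizes its catalog.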

Stream or buffer

Open the settings menu to choose whether the playground should stream audio.

When streaming is on, providers that support streaming return audio chunks as they generate them. The panel reports time-to-first-byte and received bytes as the stream arrives. Providers without streaming support fall back to buffered generation.

When streaming is off, the playground waits for the full audio result. This is the mode to use when you want Speechbase to apply target loudness normalization before playback.
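
The panel's streaming metrics are straightforward to reproduce. A minimal sketch, modeling a streamed response as an async iterable of byte chunks (the mock stream stands in for a real gateway response):

```typescript
// Sketch only: measure time-to-first-byte and total received bytes while
// consuming a chunked audio stream, as the playground panel does.
async function* mockStream(): AsyncGenerator<Uint8Array> {
  yield new Uint8Array([1, 2, 3]);
  yield new Uint8Array([4, 5]);
}

async function consumeStream(chunks: AsyncIterable<Uint8Array>) {
  const start = Date.now();
  let ttfbMs: number | null = null;
  let receivedBytes = 0;
  for await (const chunk of chunks) {
    if (ttfbMs === null) ttfbMs = Date.now() - start; // first chunk arrived
    receivedBytes += chunk.length;
  }
  return { ttfbMs, receivedBytes };
}

const stats = await consumeStream(mockStream());
```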

Target loudness

The playground can normalize generated audio to a target dBFS value when streaming is off. This mirrors the volumeDbfs option available on synthesis and conversation calls.

Use this when comparing providers that return audio at noticeably different levels. For multi-speaker production audio, the same control is usually more important on conversation requests, where multiple turns get stitched into one file.
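
Speechbase applies the normalization server-side via volumeDbfs, but the underlying math is simple dB arithmetic. A sketch of the gain calculation (the function name is illustrative):

```typescript
// Compute the linear amplitude factor that moves audio measured at
// `measuredDbfs` to `targetDbfs`. dBFS is logarithmic: a gain of g dB
// scales amplitude by 10^(g / 20).
function gainFactor(measuredDbfs: number, targetDbfs: number): number {
  return Math.pow(10, (targetDbfs - measuredDbfs) / 20);
}
```

For example, lifting audio measured at -26 dBFS to a -20 dBFS target requires +6 dB of gain, roughly doubling the amplitude; this is why providers with quiet default output sound dramatically louder after normalization.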

Generate sample text

The Surprise me button asks Speechbase to generate sample text for testing. It is useful when you want quick material for a provider comparison, especially for audio tags such as [laughs], [sighs], or [whispers] on models that support them.
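
Tag support varies by model, and a model that does not understand a tag may read it aloud as literal text. A small sketch of stripping the doc's example tags before sending text to such a model (the tag list is just the examples above, not an exhaustive set):

```typescript
// Sketch only: remove bracketed audio tags (and one trailing space) from
// input text destined for models that do not support them.
const AUDIO_TAG = /\[(?:laughs|sighs|whispers)\]\s?/g;

function stripAudioTags(text: string): string {
  return text.replace(AUDIO_TAG, "");
}
```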

Copy integration code

Each panel can show a TypeScript snippet for the selected model, voice, text, and loudness settings. Use it as a starting point, then decide whether the production version should address the voice inline or through a saved voice.
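
The copied snippet bundles the panel's settings into one request. A hypothetical shape, purely to show what travels together (field names are illustrative, not the real Speechbase API; always prefer the snippet the panel generates):

```typescript
// Sketch only: an illustrative request shape matching one panel's settings.
const request = {
  model: "providerA/model-1",   // selected provider/model
  voice: "vc_abc123",           // saved voice ID, or an inline catalog voice
  text: "Hello from the playground.",
  volumeDbfs: -20,              // loudness target; applies when not streaming
};
```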

History

The playground stores recent local comparison runs in your browser. History is for developer workflow only; production observability lives in request logs, which record operational metadata for gateway requests without storing input text or output audio.
