What's new in Speechbase

Every feature, improvement, and fix we ship to our speech infrastructure, all in one place.

  • billing

Per-model pricing

Pricing is now per model and shown up front across the dashboard and the pricing page, so you can compare providers without digging through anyone's rate card.

Bringing your own provider keys? You pay a flat 3.4% platform fee on what runs through the gateway, and nothing more.
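As a rough illustration of the fee arithmetic, here is a minimal sketch. It assumes the 3.4% applies to the dollar amount of provider spend routed through the gateway; rounding to cents is also an assumption, and `platform_fee` is a hypothetical helper, not part of any Speechbase SDK.

```python
from decimal import Decimal, ROUND_HALF_UP

PLATFORM_FEE_RATE = Decimal("0.034")  # the flat 3.4% quoted above


def platform_fee(provider_spend: str) -> Decimal:
    """Return the platform fee on bring-your-own-key spend (hypothetical helper)."""
    fee = Decimal(provider_spend) * PLATFORM_FEE_RATE
    # Rounding to whole cents is an assumption made for display only.
    return fee.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(platform_fee("250.00"))  # 8.50
```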

  • api

Feedback API

Collect quality feedback from your users on the audio you generate. POST /v1/feedback records a score from 0 to 100 plus an optional comment for any generation, so you can see how listeners reacted, generation by generation.

Every speech and conversation response returns an x-speechbase-request-id header. Attach that ID to a rating to tie the score and comment to the exact output they describe.
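Here is a minimal sketch of preparing a feedback body for POST /v1/feedback. The JSON field names (`request_id`, `score`, `comment`) are assumptions for illustration; only the 0 to 100 range, the optional comment, and the request ID come from the notes above.

```python
import json
from typing import Optional


def build_feedback(request_id: str, score: int, comment: Optional[str] = None) -> str:
    """Serialize a feedback body; field names are assumed, not confirmed."""
    if not request_id:
        raise ValueError("request_id is required")
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    body = {"request_id": request_id, "score": score}
    if comment:
        body["comment"] = comment  # the comment is optional
    return json.dumps(body)


# request_id would come from the x-speechbase-request-id response header.
payload = build_feedback("req_123", 87, "Clear and natural")
```

Validating the score on the client keeps obviously bad ratings from ever reaching the API.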

  • api

Streaming speech synthesis

Need audio to start playing before the whole clip is ready? POST /v1/audio/speech/stream streams synthesized audio straight through from the provider for the lowest possible latency.

The standard POST /v1/audio/speech now returns the complete clip in one shot, so volume and output-format options reliably apply to the finished audio.
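To show what consuming a stream can look like on the client, here is a small sketch that reads a response body in fixed-size chunks. It works with any file-like object exposing `read()`; the chunk size and the helper name `iter_audio_chunks` are assumptions, and an in-memory buffer stands in for the HTTP response.

```python
import io
from typing import BinaryIO, Iterator


def iter_audio_chunks(response: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield audio bytes as they become available instead of waiting for the full clip."""
    while True:
        chunk = response.read(chunk_size)
        if not chunk:  # an empty read means the stream has ended
            break
        yield chunk


# Stand-in for a streaming response body from POST /v1/audio/speech/stream.
fake_stream = io.BytesIO(b"\x00" * 10000)
sizes = [len(chunk) for chunk in iter_audio_chunks(fake_stream)]
print(sizes)  # [4096, 4096, 1808]
```

The object returned by `urllib.request.urlopen` also exposes `read()`, so it can be passed to the same helper and each chunk handed to a player or written to disk as it arrives.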

  • billing

Plans, credits, and usage analytics

Speechbase now runs on credits. Start free with 5,000 credits, upgrade to Pro ($30/mo or $300/yr) for 30,000 credits a month, and buy one-off top-ups whenever you need more.

Billing shows your monthly allocation and top-up balance separately, and the usage charts break spend down by model over time, so you can see exactly where your credits go.
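For a quick comparison of the two Pro billing options, the arithmetic can be sketched as below. It assumes the yearly plan grants the same 30,000 credits each month as the monthly plan; top-up pricing is not stated above and is left out.

```python
PRO_MONTHLY_USD = 30
PRO_YEARLY_USD = 300
PRO_CREDITS_PER_MONTH = 30_000


def credits_per_dollar(price_usd: float, months: int) -> float:
    """Credits received per dollar over the billing period."""
    return PRO_CREDITS_PER_MONTH * months / price_usd


monthly_rate = credits_per_dollar(PRO_MONTHLY_USD, 1)    # 1000.0
yearly_rate = credits_per_dollar(PRO_YEARLY_USD, 12)     # 1200.0
yearly_saving = PRO_MONTHLY_USD * 12 - PRO_YEARLY_USD    # 60 dollars per year
```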

  • moderation

Content moderation

You can now screen text against configurable moderation rulesets before it's synthesized. Set an org-wide default, or pass a moderation_ruleset_id on any speech or conversation request to apply a specific policy just for that call.

The same ruleset governs single-shot synthesis and multi-turn conversations, so your safety rules stay consistent everywhere.
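A per-call override can be sketched as a small helper that adds `moderation_ruleset_id` to an existing request body. The parameter name comes from the notes above; the rest of the body, the ruleset ID value, and the helper name `with_moderation` are placeholders.

```python
from typing import Optional


def with_moderation(body: dict, ruleset_id: Optional[str]) -> dict:
    """Return a copy of a request body with a per-call moderation ruleset applied.

    Passing None leaves the body untouched, so the org-wide default applies.
    """
    updated = dict(body)  # copy so the caller's dict is not mutated
    if ruleset_id is not None:
        updated["moderation_ruleset_id"] = ruleset_id
    return updated


base = {"input": "Hello there"}  # placeholder body; other fields omitted
strict = with_moderation(base, "ruleset_strict")  # hypothetical ruleset ID
```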

  • logs

Request logs

Every request now shows up in the dashboard with its provider, model, status, and credits used.

For automation, GET /v1/logs returns the same history. Filter by provider, status, and time range with cursor pagination, or fetch one request with GET /v1/logs/{id}.
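Cursor pagination usually follows the loop sketched below. The query parameter names (`provider`, `status`, `cursor`) and the response keys (`data`, `next_cursor`) are assumptions, since the response shape is not described above; `fetch_page` is a hypothetical callable that performs the HTTP request.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


def logs_url(base_url: str, provider: Optional[str] = None,
             status: Optional[str] = None, cursor: Optional[str] = None) -> str:
    """Build a GET /v1/logs URL, skipping filters that were not supplied."""
    candidates = {"provider": provider, "status": status, "cursor": cursor}
    params = {key: value for key, value in candidates.items() if value is not None}
    query = urlencode(params)
    return f"{base_url}/v1/logs" + (f"?{query}" if query else "")


def iter_logs(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield log entries page by page until no cursor is returned."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        cursor = page.get("next_cursor")
        if not cursor:
            break


# Fake pages stand in for real API responses.
pages = {
    None: {"data": [{"id": "a"}, {"id": "b"}], "next_cursor": "c1"},
    "c1": {"data": [{"id": "c"}], "next_cursor": None},
}
ids = [entry["id"] for entry in iter_logs(lambda cursor: pages[cursor])]
print(ids)  # ['a', 'b', 'c']
```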

  • api

Pronunciation dictionaries

Create reusable pronunciation rules that rewrite tricky words before synthesis, so names, acronyms, and product terms come out the way you intend. Manage them in the dashboard or via the /v1/pronunciation-dictionaries API.

Word-level timestamps still line up with your original text, so substitutions never throw off your captions.
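To make the idea of a pronunciation rule concrete, here is an illustrative sketch of whole-word substitution before synthesis. It is not Speechbase's implementation; the rule format (a plain mapping from word to respelling) and the case-sensitive matching are assumptions.

```python
import re


def apply_rules(text: str, rules: dict) -> str:
    """Replace whole-word matches with their respellings (case-sensitive)."""
    if not rules:
        return text
    # Try longer entries first so a short rule cannot shadow a longer one.
    words = sorted(rules, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(word) for word in words) + r")\b")
    return pattern.sub(lambda match: rules[match.group(0)], text)


rules = {"SQL": "sequel", "Nguyen": "win"}
print(apply_rules("Ask Nguyen about SQL.", rules))  # Ask win about sequel.
```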

  • api

Multi-speaker conversations

POST /v1/audio/conversation takes a multi-turn script, each turn with its own voice, and renders it into one mixed audio file. You can mix voices from different providers in the same conversation and choose your output format (wav, mp3, or pcm).

Great for dialogue, interviews, and any back-and-forth you'd otherwise have to stitch together yourself.
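A conversation request body might be assembled like this. The field names (`turns`, `voice`, `text`, `output_format`) and the voice identifiers are assumptions for illustration; only the three output formats come from the notes above.

```python
ALLOWED_FORMATS = {"wav", "mp3", "pcm"}  # formats listed above


def build_conversation_body(turns: list, output_format: str = "mp3") -> dict:
    """Build a body for POST /v1/audio/conversation from (voice, text) pairs."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    if not turns:
        raise ValueError("a conversation needs at least one turn")
    return {
        "turns": [{"voice": voice, "text": text} for voice, text in turns],
        "output_format": output_format,
    }


script = [
    ("host_voice", "Welcome back to the show."),   # hypothetical voice IDs
    ("guest_voice", "Thanks for having me."),
]
body = build_conversation_body(script, "wav")
```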

  • api

Word-level timestamps

The /with-timestamps endpoints return word-level timing alongside your audio, so you can build synced captions, karaoke-style highlighting, or anything that needs to know exactly when each word is spoken.

It works across providers. Speechbase uses native alignment where the provider offers it and supplies the timings itself where it doesn't.
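As one example of what word timings enable, the sketch below turns a list of timed words into WebVTT caption cues. The input shape (dicts with `word`, `start`, and `end` in seconds) is an assumption about the response, and grouping three words per cue is an arbitrary choice.

```python
def vtt_time(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def words_to_vtt(words: list, words_per_cue: int = 3) -> str:
    """Group timed words into cues that span from first word start to last word end."""
    lines = ["WEBVTT", ""]
    for i in range(0, len(words), words_per_cue):
        group = words[i:i + words_per_cue]
        lines.append(f"{vtt_time(group[0]['start'])} --> {vtt_time(group[-1]['end'])}")
        lines.append(" ".join(item["word"] for item in group))
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


timed = [
    {"word": "Hello", "start": 0.0, "end": 0.4},
    {"word": "world", "start": 0.45, "end": 0.9},
]
captions = words_to_vtt(timed)
```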

  • voices

Voice library

The /voices page now leads with trending voices from a curated catalog of 26 options across eight providers (ElevenLabs, OpenAI, Cartesia, Deepgram, Hume, Google, Inworld, and Murf), so it's easy to find a voice and hear it instantly.

Your saved voices and imports live together at /voices/library.

  • api

The Speechbase text-to-speech API

POST /v1/audio/speech gives you a single, OpenAI-style endpoint to generate speech across every supported provider. Swap models and voices without rewriting your integration.

Point it at a stored voice, a saved character, or an inline model and voice, and get the audio straight back. Bring your own provider API keys and Speechbase routes your requests through them, encrypted and isolated per organization.
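A basic request could be put together as follows, using only the Python standard library. The base URL, the Bearer authorization scheme, and the body fields (`model`, `voice`, `input`, borrowed from the OpenAI-style convention) are assumptions; the model and voice names are placeholders.

```python
import json
import urllib.request

BASE_URL = "https://api.speechbase.example"  # hypothetical host, not the real one


def speech_request(api_key: str, text: str, model: str, voice: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST /v1/audio/speech request."""
    body = json.dumps({"model": model, "voice": voice, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/audio/speech",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # auth scheme is assumed
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = speech_request("sk_test", "Hello from Speechbase", "example-model", "example-voice")
# To send it:
#   with urllib.request.urlopen(req) as resp:
#       audio = resp.read()
#       request_id = resp.headers.get("x-speechbase-request-id")
```

Because only the model and voice values change between providers, swapping either one leaves the rest of the integration untouched.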