Generate conversation with timestamps
Synthesizes a multi-turn conversation and returns a JSON envelope with base64 audio and word-level timestamps mapped back to each originating turn. Set `split: true` to instead receive an `audioSegments` array, one entry per turn, each with block-relative timestamps.
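As a rough illustration of consuming the envelope described above, the sketch below decodes the base64 `audio` field and groups word timestamps by `turnIndex`. The field names follow the sample response body in this section; the sample values are hand-written placeholders rather than real API output, and the unit of `start` and `end` is not stated here, so the numbers are illustrative only. `decode_envelope` is a hypothetical helper name.

```python
import base64
from collections import defaultdict


def decode_envelope(envelope: dict) -> tuple[bytes, dict[int, list[dict]]]:
    """Decode the base64 audio and group word timestamps by originating turn."""
    audio_bytes = base64.b64decode(envelope["audio"])
    words_by_turn: dict[int, list[dict]] = defaultdict(list)
    for word in envelope.get("timestamps", []):
        words_by_turn[word["turnIndex"]].append(word)
    return audio_bytes, dict(words_by_turn)


# Hand-written placeholder envelope (assumption: not real API output).
sample = {
    "audio": base64.b64encode(b"fake-mp3-bytes").decode("ascii"),
    "mediaType": "audio/mpeg",
    "warnings": [],
    "timestamps": [
        {"text": "How", "start": 0.0, "end": 0.2, "turnIndex": 0},
        {"text": "was", "start": 0.2, "end": 0.4, "turnIndex": 0},
        {"text": "Good,", "start": 1.5, "end": 1.9, "turnIndex": 1},
    ],
}

audio, by_turn = decode_envelope(sample)
```

Grouping by `turnIndex` keeps each speaker's words together, which is usually what a caption or transcript renderer needs.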
Authorization
bearerAuth API key
In: header
Request Body
application/json
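A minimal sketch of building the same request with Python's standard library, assuming the URL, headers, and body fields used in the curl example in this section. `build_request` is a hypothetical helper name, and the request object is only constructed here, not sent.

```python
import json
import urllib.request

# URL taken from the curl example in this section.
API_URL = "https://example.com/v1/audio/conversation/with-timestamps"


def build_request(api_key: str, turns: list[dict], split: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request for the with-timestamps endpoint."""
    payload = {
        "model": "openai/gpt-4o-mini-tts",
        "turns": turns,
        "gapMs": 500,
        "output": "mp3",
        "timestamps": "on",
    }
    if split:
        # Ask for one audioSegments entry per turn instead of a single audio field.
        payload["split"] = True
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "test-key",
    [
        {"voice": "alloy", "text": "How was your weekend?"},
        {"voice": "shimmer", "text": "Good, I finally caught up on sleep."},
    ],
)
# To actually send it: urllib.request.urlopen(request)
```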
Response Body
application/json
application/problem+json
application/problem+json
application/json
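The sample error bodies in this section come in two shapes: an `application/problem+json` object with top-level fields such as `code`, `detail`, `validation`, and `turn_index`, and a JSON object with a nested `error` member. The sketch below is one possible way to normalize both into a single summary. `summarize_error` and the summary keys are hypothetical, and which HTTP status codes map to which shape is not stated here.

```python
def summarize_error(body: dict) -> dict:
    """Normalize either error shape into one flat summary dict."""
    if "error" in body:
        # Nested shape, e.g. code "content_moderation_blocked".
        err = body["error"]
        return {
            "kind": "error_envelope",
            "code": err.get("code"),
            "message": err.get("message"),
            "turn_index": None,
            "fields": [],
        }
    # problem+json shape with optional per-field validation entries.
    return {
        "kind": "problem",
        "code": body.get("code"),
        "message": body.get("detail") or body.get("title"),
        "turn_index": body.get("turn_index"),
        "fields": [
            (".".join(str(part) for part in item.get("path", [])), item.get("message"))
            for item in body.get("validation") or []
        ],
    }


# Hand-written placeholder bodies (assumption: not real API output).
problem = summarize_error({
    "title": "Invalid request",
    "status": 400,
    "detail": "turns[1].text is empty",
    "code": "validation_failed",
    "validation": [{"path": ["turns", 1, "text"], "message": "Required"}],
    "turn_index": 1,
})
blocked = summarize_error({
    "error": {
        "code": "content_moderation_blocked",
        "message": "Blocked",
        "reason": {"type": "error_fail_closed"},
    }
})
```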
curl -X POST "https://example.com/v1/audio/conversation/with-timestamps" \
  -H "Authorization: Bearer $SPEECHBASE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini-tts",
    "turns": [
      { "voice": "alloy", "text": "How was your weekend?" },
      { "voice": "shimmer", "text": "Good, I finally caught up on sleep." }
    ],
    "gapMs": 500,
    "output": "mp3",
    "timestamps": "on"
  }'

{
"audio": "string",
"mediaType": "string",
"warnings": [
"string"
],
"timestamps": [
{
"text": "string",
"start": 0,
"end": 0,
"turnIndex": 0
}
]
}

{
"type": "string",
"title": "string",
"status": 0,
"detail": "string",
"code": "string",
"validation": [
{
"path": [
"string"
],
"message": "string"
}
],
"provider": "string",
"upstream_code": "string",
"upstream_status": 0,
"turn_index": 0
}

{
"type": "string",
"title": "string",
"status": 0,
"detail": "string",
"code": "string",
"validation": [
{
"path": [
"string"
],
"message": "string"
}
],
"provider": "string",
"upstream_code": "string",
"upstream_status": 0,
"turn_index": 0
}

{
"error": {
"code": "content_moderation_blocked",
"message": "string",
"reason": {
"type": "error_fail_closed"
}
}
}

Generate conversation POST
Synthesizes a multi-turn conversation into a single mixed audio file and returns the raw audio bytes. Set `split: true` to instead receive a JSON envelope with an `audioSegments` array, one entry per turn.
List voices GET
Returns a cursor-paginated page of voices for the authenticated organization, ordered by creation date (newest first). Pass `next_cursor` from one response back as `cursor` on the next call until `has_more` is `false`.
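The cursor loop described above can be sketched as follows. `cursor`, `next_cursor`, and `has_more` come from the description; the `voices` key for the page items, the `iter_voices` name, and the injected `fetch_page` callable are assumptions made for illustration, since the response shape is not shown here.

```python
from typing import Callable, Iterator, Optional


def iter_voices(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield voices across pages until has_more is false.

    fetch_page(cursor) is a caller-supplied function that performs the GET
    and returns the parsed JSON page; the first call receives cursor=None.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("voices", [])  # "voices" key is an assumption
        if not page.get("has_more"):
            break
        cursor = page["next_cursor"]


# Fake in-memory pages standing in for HTTP responses (not real API output).
fake_pages = {
    None: {"voices": [{"id": "v3"}, {"id": "v2"}], "has_more": True, "next_cursor": "c1"},
    "c1": {"voices": [{"id": "v1"}], "has_more": False, "next_cursor": None},
}
all_voices = list(iter_voices(lambda cursor: fake_pages[cursor]))
```

Injecting `fetch_page` keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned pages.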

