Speech to text
Transcribe audio to text through the Respan gateway with automatic logging.
Headers
Authorization
Bearer token. Set the header value to Bearer YOUR_API_KEY.
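As a minimal sketch of the header described above, the helper below builds the Authorization value from an API key. The function name auth_headers is hypothetical and used only for illustration.

```python
def auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header in the 'Bearer YOUR_API_KEY' form."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {"Authorization": f"Bearer {api_key}"}
```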
Request
This endpoint expects a multipart form containing a file.
file
Audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.
model
Model ID.
Allowed values:
language
Language of the input audio, as an ISO-639-1 code (for example, en).
prompt
Optional text to guide the model's style.
response_format
Output format.
Allowed values:
temperature
Sampling temperature, from 0 to 1.
timestamp_granularities
Timestamp granularities to include. Requires response_format to be verbose_json.
Allowed values:
customer_credentials
Per-customer LLM provider credentials.
disable_log
When true, omits the input and output from the log. Metrics are still recorded.
metadata
Custom key-value metadata.
customer_identifier
End user identifier.
customer_email
Customer email address.
thread_identifier
Conversation thread ID.
request_breakdown
Return response metrics summary in the response body.
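The request parameters above could be assembled into a multipart form roughly as follows. This is a sketch using only the Python standard library, not an official client. The endpoint URL is not stated in this section, so TRANSCRIPTION_URL is a placeholder, and the helper names (check_params, encode_multipart, build_request) are hypothetical. It also assumes every non-file parameter is sent as a plain text form field, which this section does not confirm for structured fields such as metadata or customer_credentials.

```python
import uuid
import urllib.request

# Hypothetical placeholder: this section does not state the endpoint URL.
TRANSCRIPTION_URL = "https://example.invalid/REPLACE_WITH_ENDPOINT"

# Extensions listed for the `file` parameter above.
SUPPORTED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}


def check_params(filename: str, fields: dict[str, str]) -> None:
    """Apply the constraints stated in the parameter descriptions."""
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {filename!r}")
    if "temperature" in fields and not 0.0 <= float(fields["temperature"]) <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    if "timestamp_granularities" in fields and fields.get("response_format") != "verbose_json":
        raise ValueError("timestamp_granularities requires response_format=verbose_json")


def encode_multipart(fields: dict[str, str], filename: str, audio: bytes) -> tuple[bytes, str]:
    """Encode text fields plus one `file` part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
    )
    parts.append(audio)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(api_key: str, fields: dict[str, str], filename: str,
                  audio: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST request."""
    check_params(filename, fields)
    body, content_type = encode_multipart(fields, filename, audio)
    return urllib.request.Request(
        TRANSCRIPTION_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
```

Once TRANSCRIPTION_URL points at the real endpoint, the request can be sent with urllib.request.urlopen(build_request(...)). The model value must be one of the allowed model IDs, which are not listed here.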
Response
Transcription result.
text
Transcribed text.
language
Detected language.
duration
Audio duration in seconds.
words
Word-level timestamps. Present only if requested.
segments
Segment-level timestamps. Present only if requested.
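One way to read the response fields above into a typed object is sketched below. It assumes the response body is a JSON object with these names as top-level keys, and that every field other than text may be absent. The shape of individual words and segments entries is not described in this section, so they are kept as raw dictionaries. The names Transcription and parse_transcription are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Transcription:
    text: str
    language: Optional[str] = None
    duration: Optional[float] = None  # seconds
    words: list[dict[str, Any]] = field(default_factory=list)
    segments: list[dict[str, Any]] = field(default_factory=list)


def parse_transcription(raw: str) -> Transcription:
    """Parse a JSON response body; missing optional fields get defaults."""
    data = json.loads(raw)
    return Transcription(
        text=data["text"],
        language=data.get("language"),
        duration=data.get("duration"),
        words=data.get("words") or [],
        segments=data.get("segments") or [],
    )
```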
Errors
401
Unauthorized Error
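A caller might surface the 401 case separately from other failures, since it usually points to a missing or invalid Authorization header. The sketch below is illustrative only: the exception class and the function name are hypothetical, and the format of the error body is not documented in this section, so it is passed through as plain text.

```python
class UnauthorizedError(Exception):
    """Raised when the gateway answers with HTTP 401."""


def raise_for_status(status: int, body: str = "") -> None:
    """Raise on error statuses; return None for anything below 400."""
    if status == 401:
        raise UnauthorizedError(f"401 Unauthorized: check the Authorization header. {body}".strip())
    if status >= 400:
        raise RuntimeError(f"request failed with HTTP {status}: {body}".strip())
```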