Speech to text

You can use Respan's unified LLM API to call a speech-to-text model and turn audio into text. Respan currently supports `whisper-1` from OpenAI.

## Integration steps

1. Go to the [OpenAI API platform](https://platform.openai.com/api-keys) to get your OpenAI API key.
2. Add your OpenAI API key on the [Respan credentials page](https://platform.respan.ai/platform/api/providers).

<img height="200" width="500" src="https://keywordsai-static.s3.us-east-1.amazonaws.com/docs/settings/providers.jpg"/>

```python OpenAI Python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.respan.ai/api/",
    api_key="YOUR_RESPAN_API_KEY",
)

audio_file = open("/path/to/file/audio.mp3", "rb")
response = client.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file,
    extra_body={"customer_identifier": "customer_11"}  # All Respan parameters are supported
)
```

```typescript OpenAI TypeScript
import fs from "fs";
import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://api.respan.ai/api/",
  apiKey: "YOUR_RESPAN_API_KEY",
});

const transcription = await openai.audio.transcriptions.create({
  model: "whisper-1",
  file: fs.createReadStream("/path/to/file/audio.mp3"),
  // @ts-expect-error -- Respan-specific parameter; all Respan parameters are supported
  customer_identifier: "customer_11",
});
```

```python Python
import requests

def demo_call(file_obj, model="whisper-1", token="YOUR_RESPAN_API_KEY"):
    headers = {
        "Authorization": f"Bearer {token}",
    }
    # A file object is not JSON-serializable, so the upload is sent as
    # multipart/form-data: the file goes in `files`, the other fields in `data`.
    data = {
        "model": model,
        "customer_identifier": "customer_11",
    }
    response = requests.post(
        "https://api.respan.ai/api/audio/transcription",
        headers=headers,
        data=data,
        files={"file": file_obj},
    )
    return response

audio_file = open("/path/to/file/audio.mp3", "rb")
print(demo_call(audio_file).json())
```

```typescript TypeScript
import fs from "fs";

// The upload is sent as multipart/form-data rather than JSON.
const form = new FormData();
form.append("model", "whisper-1");
form.append("file", new Blob([fs.readFileSync("/path/to/file/audio.mp3")]), "audio.mp3");
form.append("customer_identifier", "customer_11");

fetch("https://api.respan.ai/api/audio/transcription", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_RESPAN_API_KEY",
  },
  body: form,
})
  .then(response => response.json())
  .then(data => console.log(data));
```

## OpenAI parameters

- `file` *file* **required**: The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
- `model` *string* **required**: ID of the model to use. Only `whisper-1` is currently available.
- `language` *string*: The language of the input audio. Supplying the input language in [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) format will improve accuracy and latency.
- `prompt` *string*: An optional text to guide the model's style or continue a previous audio segment. The [prompt](https://platform.openai.com/docs/guides/speech-to-text/prompting) should match the audio language.
- `response_format` *string*: The format of the transcript output, in one of these options: `json`, `text`, `srt`, `verbose_json`, or `vtt`.
- `temperature` *number*: The sampling temperature, between 0 and 1. Higher values make the output more random, while lower values make it more focused and deterministic.
- `timestamp_granularities` *array*: The timestamp granularities to populate for this transcription. `response_format` must be set to `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word` or `segment`. Note: there is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.

## Respan parameters

See how to make a standard Respan API call in the [Quick Start](/documentation/features/gateway/gateway-quickstart) guide.

### Generation parameters

- `customer_credentials` *object*: You can pass in your customer's credentials for [supported providers](/integrations/overview/overview) and use their credits when our proxy is calling models from those providers.
  <br/>See details [here](/documentation/admin/llm-provider-keys)

  **Example**
  ```json
  "customer_credentials": {
    "openai": {
      "api_key": "YOUR_OPENAI_API_KEY"
    }
  }
  ```
- `disable_log` *boolean*: When set to true, only the request and [performance metrics](/documentation/features/monitoring/metrics) will be recorded; input and output messages will be omitted from the log.

### Observability parameters

- `metadata` *dict*: You can add any key-value pair to this metadata field for your reference. Check the [details of metadata here.](/documentation/features/tracing/spans/span-fields-parameters#custom-properties) Contact team@respan.ai if you need extra parameter support for your use case.

  **Example**
  ```json
  {
    "my_value_key": "my_value"
  }
  ```
- `customer_identifier` *string*: Use this as a tag to identify the user associated with the API call. See the [details of customer identifier here.](/documentation/features/user-analytics/customer-identifier)

  **Example**
  ```json
  {
    //...other_params,
    "customer_identifier": "user_123"
  }
  ```
- `customer_email` *string*: The email address of the user associated with the API call. You can add your corresponding user's email address to the request, and you can also edit customers' emails on the platform. Check the [details of user editing here.](/api-reference/observe/users/update-user)

  **Example**
  ```json
  {
    //...other_params,
    "customer_email": "hendrix@respan.ai"
  }
  ```
- `thread_identifier` *string*: See spans as a conversation thread. Pass all spans with the same `thread_identifier` to see them in the same thread.

  **Example**
  ```json
  {
    //...other_params,
    "thread_identifier": "test_thread"
  }
  ```
- `request_breakdown` *boolean*: Adding this returns a summary of the response in the response body. If streaming is on, the metrics will be streamed as the last chunk.

  **Properties**
  ```json Regular Response
  {
    "id": "chatcmpl-7476cf3f-fcc9-4902-a548-a12489856d8a",
    //... main part of the response body ...
    "request_breakdown": {
      "prompt_tokens": 6,
      "completion_tokens": 9,
      "cost": 4.8e-5,
      "prompt_messages": [
        { "role": "user", "content": "How are you doing today?" }
      ],
      "completion_message": {
        "content": " I'm doing well, thanks for asking!",
        "role": "assistant"
      },
      "model": "claude-2",
      "cached": false,
      "timestamp": "2024-02-20T01:23:39.329729Z",
      "status_code": 200,
      "stream": false,
      "latency": 1.8415491580963135,
      "scores": {},
      "category": "Questions",
      "metadata": {},
      "routing_time": 0.18612787732854486,
      "full_request": {
        "messages": [
          { "role": "user", "content": "How are you doing today?" }
        ],
        "model": "claude-2",
        "logprobs": true
      },
      "sentiment_score": 0
    }
  }
  ```
  ```json Streaming Response
  //... other chunks ...
  // The following is the last chunk
  {
    "id": "request_breakdown",
    "choices": [
      {
        "delta": { "content": null, "role": "assistant" },
        "finish_reason": "stop",
        "request_breakdown": {
          "prompt_tokens": 6,
          "completion_tokens": 9,
          "cost": 4.8e-5, // in USD
          "prompt_messages": [
            { "role": "user", "content": "How are you doing today?" }
          ],
          "completion_message": {
            "content": " I'm doing well, thanks for asking!",
            "role": "assistant"
          },
          "model": "claude-2",
          "cached": false,
          "timestamp": "2024-02-20T01:23:39.329729Z",
          "status_code": 200,
          "stream": false,
          "latency": 1.8415491580963135, // in seconds
          "scores": {},
          "category": "Questions",
          "metadata": {},
          "routing_time": 0.18612787732854486, // in seconds
          "full_request": {
            "messages": [
              { "role": "user", "content": "How are you doing today?" }
            ],
            "model": "claude-2",
            "logprobs": true
          },
          "sentiment_score": 0
        },
        "index": 0,
        "message": { "content": null, "role": "assistant" }
      }
    ],
    "created": 1706100589,
    "model": "extra_parameter",
    "object": "chat.completion.chunk",
    "system_fingerprint": null,
    "usage": {}
  }
  ```
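When streaming is on, the breakdown arrives as the final chunk, nested under `choices[0]`. A minimal sketch of pulling it out of a list of parsed chunks (the helper name and sample chunks are ours for illustration, not part of the API):

```python
def extract_request_breakdown(chunks):
    """Return the request_breakdown summary from a list of streamed chunks.

    Per the streaming example above, the metrics come in a final chunk
    whose id is "request_breakdown", nested under choices[0].
    Returns None if no such chunk is present.
    """
    for chunk in chunks:
        if chunk.get("id") == "request_breakdown":
            return chunk["choices"][0].get("request_breakdown")
    return None

# Hypothetical parsed stream: one content chunk, then the metrics chunk.
stream = [
    {"id": "chatcmpl-1", "choices": [{"delta": {"content": "Hi"}}]},
    {"id": "request_breakdown",
     "choices": [{"delta": {"content": None},
                  "finish_reason": "stop",
                  "request_breakdown": {"cost": 4.8e-5, "latency": 1.84}}]},
]
metrics = extract_request_breakdown(stream)
```

After this, `metrics` holds the cost and latency fields shown in the streaming example above.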

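Stepping back to the OpenAI transcription parameters documented above: the optional ones only need to be sent when set, and `timestamp_granularities` is only valid with `response_format="verbose_json"`. A sketch of assembling the form fields, assuming that constraint (the function name is ours, not part of the Respan API):

```python
def build_transcription_fields(model="whisper-1", language=None, prompt=None,
                               response_format=None, temperature=None,
                               timestamp_granularities=None):
    """Assemble the non-file fields for an audio transcription request.

    Only parameters that are actually set are included, mirroring the
    optional OpenAI parameters documented above.
    """
    fields = {"model": model}
    if language is not None:
        fields["language"] = language  # ISO-639-1 code, e.g. "en"
    if prompt is not None:
        fields["prompt"] = prompt
    if response_format is not None:
        fields["response_format"] = response_format
    if temperature is not None:
        fields["temperature"] = temperature
    if timestamp_granularities is not None:
        # Timestamp granularities require the verbose_json output format.
        if response_format != "verbose_json":
            raise ValueError(
                "timestamp_granularities requires response_format='verbose_json'"
            )
        fields["timestamp_granularities"] = timestamp_granularities
    return fields

fields = build_transcription_fields(language="en",
                                    response_format="verbose_json",
                                    timestamp_granularities=["word"])
```

The resulting dict can be passed as the `data` payload alongside the file upload, as in the `requests` example above.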
## Authentication

- `Authorization` *Bearer* **required**: API key authentication. Get your API key from https://platform.respan.ai/platform/api-keys.

## Request

This endpoint expects an object.

## Response

Successful response for Speech to text.

## Errors

- `401`: Unauthorized Error
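A missing or invalid API key produces the 401 response above. A minimal error-handling sketch around the transcription endpoint (the `requests` library is third-party, and the helper names are ours for illustration):

```python
API_URL = "https://api.respan.ai/api/audio/transcription"

def raise_for_auth(status_code):
    """Map a 401 status to a descriptive error, per the Errors section above."""
    if status_code == 401:
        raise PermissionError(
            "401 Unauthorized: check the API key from "
            "https://platform.respan.ai/platform/api-keys"
        )

def transcribe(path, api_key, model="whisper-1"):
    """Upload an audio file for transcription, surfacing auth failures early."""
    import requests  # third-party; imported lazily so raise_for_auth stays stdlib-only

    with open(path, "rb") as f:
        resp = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            files={"file": f},
            data={"model": model},
        )
    raise_for_auth(resp.status_code)
    resp.raise_for_status()  # surface any other non-2xx response
    return resp.json()
```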