# Text Generation

<Note>Make sure you add the **Bearer** prefix to your token.</Note>

You can paste the command below into your terminal to run your first API request. Make sure to replace `YOUR_RESPAN_API_KEY` with your actual Respan API key.

## Example Call

```bash cURL
curl -X POST "https://api.respan.ai/api/generate/" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {YOUR_RESPAN_API_KEY}" \
-d '{
  "messages": [
    {
      "role": "user",
      "content": "Hello"
    }
  ],
  "models": ["gpt-3.5-turbo", "gpt-3.5-turbo-16k"],
  "stream": false,
  "max_tokens": 100
}'
```

```python Python
import requests

def demo_call(input,
              model="claude-2",
              token="YOUR_RESPAN_API_KEY",
              stream=False):
    headers = {
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {token}',
    }
    data = {
        'model': model,
        'messages': [{'role': 'user', 'content': input}],
        'stream': stream,
        'models': ["gpt-3.5-turbo", "gpt-3.5-turbo-16k"],
    }
    response = requests.post('https://api.respan.ai/api/generate/',
                             headers=headers, json=data, stream=stream)
    return response

messages = "Say 'Hello World'"
print(demo_call(messages).json())
```

```typescript TypeScript
// Call the endpoint with fetch
fetch('https://api.respan.ai/api/generate/', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_RESPAN_API_KEY'
  },
  body: JSON.stringify({
    models: ['gpt-3.5-turbo', 'gpt-3.5-turbo-16k'],
    messages: [{role: 'user', content: "Say 'Hello World'"}]
  })
})
  .then(response => response.json())
  .then(data => console.log(data));
```

## OpenAI Parameters

- `model` *string*: Specify which model to use. See the list of models [here](/integrations/overview/overview)
- `messages` *array* **required**: List of messages to send to the endpoint in the [OpenAI style](https://platform.openai.com/docs/api-reference/chat/create#chat-create-messages), each of them following this format:

```json
{
  "role": "user", // Available choices are user, system or assistant
  "content": "Hello?"
}
```

<strong>Image Processing</strong>

If you want to use the image processing feature, you need to use the following format to upload the image:

```json
{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "What's in this image?"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "https://as1.ftcdn.net/v2/jpg/01/34/53/74/1000_F_134537443_VendrqyXIWyHrZgxdIsfyKUost734JDP.jpg"
      }
    }
  ]
}
```

- `max_tokens` *number*: Maximum number of tokens to generate in the response
- `temperature` *number*: Controls randomness in the output in the range of 0-2; a higher temperature produces a more random response.
- `n` *number*: Specify how many completion choices to generate for each prompt. <strong>Caveat!</strong> While this can help improve generation quality by picking the optimal choice, it can also lead to more token usage.
- `stream` *boolean*: Whether to stream back partial progress token by token {/* Add this to the concept page */}
- `logprobs` *boolean*: Include the log probabilities of each token being selected.
- `echo` *boolean*: Echo back the prompt in addition to the completion {/* Add this to the concept page */}
- `stop` *array[string]*: Stop sequence {/* Add this to the concept page */}
- `presence_penalty` *number*: Specify how much to penalize new tokens based on whether they appear in the text so far. Increases the model's likelihood of talking about new topics {/* Add this to the concept page */}
- `frequency_penalty` *number*: Specify how much to penalize new tokens based on their existing frequency in the text so far. Decreases the model's likelihood of repeating the same line verbatim {/* Add this to the concept page */}
- `logit_bias` *dict*: Used to modify the probability of tokens appearing in the response
- `tools` *array[dict]*: A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide an array of functions the model may generate JSON inputs for. A full request sketch follows this list.

```json
{
  "type": "function",
  "function": {
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "The city and state, e.g. San Francisco, CA"
        },
        "unit": {
          "type": "string",
          "enum": ["celsius", "fahrenheit"]
        }
      },
      "required": ["location"]
    }
  }
}
```

- `tool_choice` *dict*: Manually pick the tool for the model to use. This will force the model to make a function call every time a function is passed in.

```json
{
  "type": "function",
  "function": {"name": "name_of_the_function"}
}
```
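As a concrete illustration of the `tools` and `tool_choice` parameters above, here is a minimal request sketch. The weather function, the prompt, and the `YOUR_RESPAN_API_KEY` placeholder are illustrative values, not part of the API contract; adapt them to your own setup.

```python Python
import requests

# Minimal sketch of a function-calling request (illustrative values only).
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_RESPAN_API_KEY",
}

weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    },
}

data = {
    "messages": [{"role": "user", "content": "What's the weather in Boston?"}],
    "model": "gpt-3.5-turbo",
    "tools": [weather_tool],
    # Forces the model to call the named function on every request.
    "tool_choice": {"type": "function", "function": {"name": "get_current_weather"}},
}

response = requests.post("https://api.respan.ai/api/generate/", headers=headers, json=data)
print(response.json())
```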
## Respan Parameters

- `request_breakdown` *boolean*: Adding this returns a summary of the request and response (tokens, cost, latency, etc.) in the response body. If streaming is on, the metrics will be streamed as the last chunk.

<strong>Regular Response</strong>

```json
{
  "id": "chatcmpl-7476cf3f-fcc9-4902-a548-a12489856d8a",
  //... main part of the response body ...
  "request_breakdown": {
    "prompt_tokens": 6,
    "completion_tokens": 9,
    "cost": 4.8e-5,
    "prompt_messages": [
      {
        "role": "user",
        "content": "How are you doing today?"
      }
    ],
    "completion_message": {
      "content": " I'm doing well, thanks for asking!",
      "role": "assistant"
    },
    "model": "claude-2",
    "cached": false,
    "timestamp": "2024-02-20T01:23:39.329729Z",
    "status_code": 200,
    "stream": false,
    "latency": 1.8415491580963135,
    "scores": {},
    "category": "Questions",
    "metadata": {},
    "routing_time": 0.18612787732854486,
    "full_request": {
      "messages": [
        {
          "role": "user",
          "content": "How are you doing today?"
        }
      ],
      "model": "claude-2",
      "logprobs": true
    },
    "sentiment_score": 0
  }
}
```

<strong>Streaming Response</strong>

```json
//... other chunks ...
// The following is the last chunk
{
  "id": "request_breakdown",
  "choices": [
    {
      "delta": {
        "content": null,
        "role": "assistant"
      },
      "finish_reason": "stop",
      "request_breakdown": {
        "prompt_tokens": 6,
        "completion_tokens": 9,
        "cost": 4.8e-5, // in USD
        "prompt_messages": [
          {
            "role": "user",
            "content": "How are you doing today?"
          }
        ],
        "completion_message": {
          "content": " I'm doing well, thanks for asking!",
          "role": "assistant"
        },
        "model": "claude-2",
        "cached": false,
        "timestamp": "2024-02-20T01:23:39.329729Z",
        "status_code": 200,
        "stream": false,
        "latency": 1.8415491580963135, // in seconds
        "scores": {},
        "category": "Questions",
        "metadata": {},
        "routing_time": 0.18612787732854486, // in seconds
        "full_request": {
          "messages": [
            {
              "role": "user",
              "content": "How are you doing today?"
            }
          ],
          "model": "claude-2",
          "logprobs": true
        },
        "sentiment_score": 0
      },
      "index": 0,
      "message": {
        "content": null,
        "role": "assistant"
      }
    }
  ],
  "created": 1706100589,
  "model": "extra_parameter",
  "object": "chat.completion.chunk",
  "system_fingerprint": null,
  "usage": {}
}
```

- `models` *array*: Specify the list of models for the router to choose between. If not specified, <strong>all models</strong> will be used. See the list of models [here](/integrations/overview/overview). If only one model is specified, it will be treated as if the `model` parameter were used and the router will not trigger.
- `fallback_models` *array*: Specify the list of backup models (ranked by priority) to respond in case of a failure in the primary model. See the list of models [here](/integrations/overview/overview)
- `customer_credentials` *object*: You can pass in a dictionary of your customer's credentials and deployment variables for [supported providers](/integrations/overview/overview) and use their credits when the router is calling models from those providers.

```json
{
  "openai": {
    "api_key": "sk-DEacrWTDndFYhdcLYKF6T3BlbkdfghuyTj4sYL2v1EDhg3iz5",
    "some_vars": "some_values"
  },
  "provider_id": {
    "some_provider_var_names": "some_provider_var_values"
  }
}
```

- `customer_identifier` *string*: Use this as a tag to identify the user associated with the API call.
- `metadata` *dict*: You can add any key-value pairs to this metadata field for your reference. Contact team@respan.ai if you need extra parameter support for your use case. Example:

```json
{
  "my_value_key": "my_value"
}
```

- `disable_log` *boolean*: When set to true, only the request and the [performance metrics](/documentation/features/monitoring/metrics) will be recorded; input and output messages will be omitted from the log.
- `exclude_models` *array*: The list of models to exclude from the router's selection. See the list of models [here](/integrations/overview/overview). This only excludes models from the router's selection: the `model` parameter takes precedence over this parameter, and `fallback_models` and the [safety net](/documentation/features/monitoring/notifications/subscribe-alerts) will still use the excluded models to catch failures.
- `exclude_providers` *array*: The list of providers to exclude from the router's selection. All models under the provider will be excluded. See the list of providers [here](/integrations/overview/overview). This only excludes models from the router's selection: the `model` parameter takes precedence over this parameter, and `fallback_models` and the [safety net](/documentation/features/monitoring/notifications/subscribe-alerts) will still use the excluded models to catch failures.
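Putting several of these parameters together, here is a minimal, non-streaming sketch of a routed request: the router picks between two models, falls back on failure, tags the call with a customer identifier and metadata, and the cost is read back from the returned `request_breakdown`. The model names, customer identifier, and key placeholder are illustrative assumptions, not required values.

```python Python
import requests

# Sketch of a routed request using Respan parameters (illustrative values only).
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_RESPAN_API_KEY",
}

data = {
    "messages": [{"role": "user", "content": "Summarize our meeting notes."}],
    # Let the router choose between these models, with a backup if the primary fails.
    "models": ["gpt-3.5-turbo", "gpt-3.5-turbo-16k"],
    "fallback_models": ["claude-2"],
    # Tag the call for per-customer tracking and attach free-form metadata.
    "customer_identifier": "customer_123",
    "metadata": {"my_value_key": "my_value"},
    # Ask for the metrics summary in the response body.
    "request_breakdown": True,
}

response = requests.post("https://api.respan.ai/api/generate/", headers=headers, json=data)
body = response.json()

# The breakdown is attached to the response body when request_breakdown is true.
breakdown = body.get("request_breakdown", {})
print("model used:", breakdown.get("model"))
print("cost (USD):", breakdown.get("cost"))
```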
## Deprecated Parameters

- `customer_api_keys` *object*: You can pass in a dictionary of your customer's API keys for specific models. If the router selects a model that is in the dictionary, it will attempt to use the customer's API key for calling the model before using your [integration API key](/documentation/admin/llm-provider-keys) or Respan's default API key.

```json
{
  "gpt-3.5-turbo": "your_customer_api_key",
  "gpt-4": "your_customer_api_key"
}
```
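In contrast to the deprecated `customer_api_keys`, the supported way to bill a call to your customer's own provider account is the `customer_credentials` parameter described above. A minimal sketch, assuming your customer has supplied an OpenAI key (the key placeholder and prompt are illustrative):

```python Python
import requests

# Sketch of passing a customer's own OpenAI credentials (placeholder values only).
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_RESPAN_API_KEY",
}

data = {
    "messages": [{"role": "user", "content": "Hello"}],
    "models": ["gpt-3.5-turbo"],
    # When the router calls an OpenAI model, it uses these credentials,
    # so the usage is billed to your customer's account instead of yours.
    "customer_credentials": {
        "openai": {
            "api_key": "YOUR_CUSTOMERS_OPENAI_API_KEY"
        }
    },
}

response = requests.post("https://api.respan.ai/api/generate/", headers=headers, json=data)
print(response.json())
```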

## Authentication

- `Authorization` *Bearer*: API key authentication. Get your API key from https://platform.respan.ai/platform/api-keys

## Request

This endpoint expects an object.

## Response

Successful response for Text Generation:

- `role` *string*
- `content` *list of objects*

## Errors

- `401`: Unauthorized Error