# Create span

This guide shows you how to create any type of LLM span in Respan using the **universal input/output design** that supports all span types.

<Warning>
**Span size limit: 20MB**

Each span payload has a maximum size of 20MB. This includes the `input`, `output`, and all other fields combined. Spans exceeding this limit will be rejected.
</Warning>

## Input/Output

**Respan uses universal `input` and `output` fields across all span types.**

- **Chat completions**: Messages arrays
- **Embeddings**: Text strings or arrays
- **Transcriptions**: Audio metadata → text
- **Speech**: Text → audio
- **Workflows/Tasks**: Any custom data structure
- **Agent operations**: Complex nested objects

**How it works:**

1. You provide `input` and `output` fields in any structure (string, object, array, etc.)
2. Set `log_type` to indicate the span type (`"chat"`, `"embedding"`, `"workflow"`, etc.)
3. Respan automatically extracts type-specific fields for backward compatibility
4. Your data is stored efficiently and retrieved with both universal and type-specific fields

For complete `log_type` specifications, see [log types](/documentation/resources/reference/data-model#log-types).

<Note>
### Legacy field support

For backward compatibility, Respan still supports legacy fields:

- `prompt_messages` *array*: **Legacy field.** Use `input` instead.
- `completion_message` *object*: **Legacy field.** Use `output` instead.
</Note>

## Request body

### Core fields

- `input` *string | object | array*: Universal input field for the span. Structure depends on `log_type`:
  - **Chat**: JSON string of a messages array, or the messages array directly
  - **Embedding**: Text string or array of strings
  - **Workflow/Task**: Any JSON-serializable structure
  - **Transcription**: Audio file reference or metadata object
  - **Speech**: Text string or TTS configuration object

See the **Span Types** section below for complete specifications.
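The per-type shapes above can be sketched in Python. This is a hedged illustration of how the universal `input` and `output` fields might be populated for three `log_type` values; the field names come from this guide, while the values are invented for illustration:

```python
import json

# Chat: input/output may be JSON strings of messages, or the structures directly.
chat_span = {
    "log_type": "chat",
    "input": json.dumps([
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello"},
    ]),
    "output": json.dumps({"role": "assistant", "content": "Hello! How can I help you?"}),
}

# Embedding: input is a plain text string, output an array of vector values.
embedding_span = {
    "log_type": "embedding",
    "input": "Respan is an LLM observability platform",
    "output": [0.123, -0.456, 0.789],
}

# Workflow: any JSON-serializable structure is accepted on both sides.
workflow_span = {
    "log_type": "workflow",
    "input": {"query": "Help with order #12345", "context": {"user_id": "123"}},
    "output": {"resolution": "refund_issued"},
}
```

Note that for chat spans Respan extracts `prompt_messages` and `completion_message` from these fields automatically, so only the universal fields need to be sent.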
**Example for Chat**

```json
"input": "[{\"role\":\"system\",\"content\":\"You are helpful.\"},{\"role\":\"user\",\"content\":\"Hello\"}]"
```

**Example for Embedding**

```json
"input": "Respan is an LLM observability platform"
```

**Example for Workflow**

```json
"input": "{\"query\":\"Help with order #12345\",\"context\":{\"user_id\":\"123\"}}"
```

- `output` *string | object | array*: Universal output field for the span. Structure depends on `log_type`:
  - **Chat**: JSON string of the completion message, or the message object directly
  - **Embedding**: Array of vector embeddings
  - **Workflow/Task**: Any JSON-serializable result structure
  - **Transcription**: Transcribed text string
  - **Speech**: Audio file reference or base64 audio data

**Example for Chat**

```json
"output": "{\"role\":\"assistant\",\"content\":\"Hello! How can I help you?\"}"
```

**Example for Embedding**

```json
"output": "[0.123, -0.456, 0.789, ...]"
```

- `log_type` *string*: Type of span being logged. Determines how `input` and `output` are parsed.

  **Supported types:**

  - `"chat"` - Chat completion requests (default)
  - `"completion"` - Legacy completion requests
  - `"response"` - OpenAI Response API
  - `"embedding"` - Embedding generation
  - `"transcription"` - Speech-to-text
  - `"speech"` - Text-to-speech
  - `"workflow"` or `"agent"` - Workflow/agent execution
  - `"task"` or `"tool"` - Task/tool execution
  - `"function"` - Function call
  - `"generation"` - Generation span
  - `"handoff"` - Agent handoff
  - `"guardrail"` - Safety check
  - `"custom"` - Custom span type

  **Default behavior**

  If not specified, defaults to `"chat"`. For chat types, the system automatically extracts `prompt_messages` and `completion_message` from `input` and `output` for backward compatibility.

  For complete specifications of each type, see [log types](/documentation/resources/reference/data-model#log-types).

- `model` *string*: The model used for the inference. Optional, but recommended for chat/completion/embedding types.
**Example**

```json
"model": "gpt-4o-mini"
```

### Telemetry

Performance metrics and cost tracking for monitoring LLM efficiency.

- `usage` *object*: Token usage information for the request.

  **Properties**

  - `prompt_tokens` *integer*: Number of tokens in the prompt/input.
  - `completion_tokens` *integer*: Number of tokens in the completion/output.
  - `total_tokens` *integer*: Total tokens (prompt + completion).
  - `prompt_tokens_details` *object*: Detailed breakdown of prompt tokens (e.g., cached tokens).
  - `cache_creation_prompt_tokens` *integer*: For Anthropic models, the number of tokens used to create the cache.

**Example**

```json
{
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 85,
    "total_tokens": 235,
    "prompt_tokens_details": {
      "cached_tokens": 10
    }
  }
}
```

- `cost` *float*: Cost of the inference in US dollars. If not provided, it is calculated automatically from model pricing.
- `latency` *float*: Total request latency in seconds.

  <Note>
  Previously called `generation_time`. For backward compatibility, both field names are supported.
  </Note>

- `time_to_first_token` *float*: Time to first token (TTFT) in seconds. Useful for streaming responses and voice AI applications.

  <Note>
  Previously called `ttft`. Both field names are supported.
  </Note>

- `tokens_per_second` *float*: Generation speed in tokens per second.

### Metadata

Custom tracking and identification parameters for advanced analytics and filtering.

- `metadata` *object*: Add any key-value pairs to this field for your own reference. Useful for custom analytics and filtering.

**Example**

```json
{
  "metadata": {
    "language": "en",
    "environment": "production",
    "version": "v1.0.0",
    "feature": "chat_support",
    "user_tier": "premium"
  }
}
```

- `customer_identifier` *string*: An identifier for the customer that invoked this request. Helps with visualizing user activities. Maximum 254 characters; longer values are auto-truncated. See [customer identifier details](/documentation/features/user-analytics/customer-identifier).
**Example**

```json
"customer_identifier": "user_123"
```

- `customer_params` *object*: Extended customer information (an alternative to the individual customer fields).

  **Properties**

  - `customer_identifier` *string*: Customer identifier.
  - `name` *string*: Customer name.
  - `email` *string*: Customer email.

**Example**

```json
{
  "customer_params": {
    "customer_identifier": "customer_123",
    "name": "John Doe",
    "email": "john.doe@example.com"
  }
}
```

- `thread_identifier` *string*: A unique identifier for the conversation thread. Useful for multi-turn conversations.
- `custom_identifier` *string*: Same functionality as `metadata`, but indexed for faster querying.

**Example**

```json
"custom_identifier": "ticket_12345"
```

- `group_identifier` *string*: Group identifier. Use it to group related spans together.

### Workflow & tracing

Parameters for distributed tracing and workflow tracking.

- `trace_unique_id` *string*: Unique identifier for the trace. Used to link multiple spans together in distributed tracing.
- `span_workflow_name` *string*: Name of the workflow this span belongs to.
- `span_name` *string*: Name of this specific span/task within the workflow.
- `span_parent_id` *string*: ID of the parent span. Used to build the trace hierarchy.

## Advanced parameters

### Tool calls and function calling

- `tools` *array*: A list of tools the model may call. Currently, only functions are supported as a tool.

  **Properties**

  - `type` *string* **required**: The type of the tool. Currently, only `function` is supported.
  - `function` *object* **required**:

    **Properties**

    - `name` *string* **required**: The name of the function.
    - `description` *string*: A description of what the function does.
    - `parameters` *object*: The parameters the function accepts.
**Example**

```json
"tools": [
  {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA"
          },
          "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["location"]
      }
    }
  }
]
```

- `tool_choice` *string | object*: Controls which (if any) tool is called by the model. Can be `"none"`, `"auto"`, or an object specifying a specific tool.

**Example**

```json
"tool_choice": {
  "type": "function",
  "function": {
    "name": "get_current_weather"
  }
}
```

### Response configuration

- `response_format` *object*: Setting this to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs.

  **Possible types**

  - **Text**: `{ "type": "text" }` - Default response format
  - **JSON Schema**: `{ "type": "json_schema", "json_schema": {...} }` - Structured outputs
  - **JSON Object**: `{ "type": "json_object" }` - Legacy JSON format

### Model configuration

- `temperature` *number*: Controls randomness in the output (0-2). Higher values produce more random responses. Defaults to 1.
- `top_p` *number*: Nucleus sampling parameter. Alternative to temperature. Defaults to 1.
- `frequency_penalty` *number*: Penalizes tokens based on their frequency in the text so far.
- `presence_penalty` *number*: Penalizes tokens based on whether they appear in the text so far.
- `max_tokens` *integer*: Maximum number of tokens to generate.
- `stop` *array[string]*: Stop sequences where generation will stop.

### Error handling and status

- `status_code` *integer*: The HTTP status code for the request. Default is 200 (success).

  **Supported status codes**

  All valid HTTP status codes are supported: `200`, `201`, `400`, `401`, `403`, `404`, `429`, `500`, `502`, `503`, `504`, etc.

- `error_message` *string*: Error message if the request failed. Defaults to an empty string.
- `warnings` *string | object*: Any warnings that occurred during the request.
- `status` *string*: Request status. Common values: `"success"`, `"error"`.

### Additional configuration

- `stream` *boolean*: Whether the response was streamed.
- `prompt_id` *string*: ID of the prompt template used. See the [Prompts documentation](/documentation/features/prompt-management/manage-prompts).
- `prompt_name` *string*: Name of the prompt template.
- `is_custom_prompt` *boolean*: Whether the prompt is a custom prompt. Set to `true` if using a custom `prompt_id`.
- `timestamp` *string*: ISO 8601 timestamp when the request completed.

**Example**

```json
"timestamp": "2025-01-01T10:30:00Z"
```

- `start_time` *string*: ISO 8601 timestamp when the request started.
- `full_request` *object*: The full request object. Useful for logging additional configuration parameters.

  <Note>
  Tool calls and other nested objects are automatically extracted from `full_request`.
  </Note>

- `full_response` *object*: The full response object from the model provider.

### Pricing configuration

- `prompt_unit_price` *number*: Custom price per 1M prompt tokens, in US dollars. Used for self-hosted or fine-tuned models.

**Example**

```json
"prompt_unit_price": 0.0042
```

- `completion_unit_price` *number*: Custom price per 1M completion tokens, in US dollars. Used for self-hosted or fine-tuned models.

**Example**

```json
"completion_unit_price": 0.0042
```

### API controls

- `respan_api_controls` *object*: Control the behavior of the Respan logging API.

  **Properties**

  - `block` *boolean*: If `false`, the server immediately returns initialization status without waiting for log completion.

**Example**

```json
{
  "respan_api_controls": {
    "block": true
  }
}
```

- `positive_feedback` *boolean*: Whether the user liked the output. `true` means positive feedback.
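Putting the core, telemetry, and tracing fields together, here is a minimal Python sketch that assembles a span payload and checks it against the 20MB limit before sending. The helper name and the client-side size check are our own illustration, not part of the API; only the field names come from the reference above.

```python
import json

MAX_SPAN_BYTES = 20 * 1024 * 1024  # 20MB payload limit documented above

def build_span_payload(input_data, output_data, log_type="chat", **fields):
    """Assemble a span payload and enforce the documented 20MB size limit.

    Hypothetical helper for illustration; the server enforces the same
    limit and rejects oversized spans.
    """
    payload = {
        "log_type": log_type,
        "input": input_data,
        "output": output_data,
        **fields,
    }
    body = json.dumps(payload)
    if len(body.encode("utf-8")) > MAX_SPAN_BYTES:
        raise ValueError("span payload exceeds the 20MB limit and would be rejected")
    return payload

payload = build_span_payload(
    input_data=[{"role": "user", "content": "Hello"}],
    output_data={"role": "assistant", "content": "Hi!"},
    model="gpt-4o-mini",
    usage={"prompt_tokens": 150, "completion_tokens": 85, "total_tokens": 235},
    customer_identifier="user_123",
    metadata={"environment": "production"},
    trace_unique_id="trace_abc",
    span_name="support_reply",
)
# POST the JSON body to the Create span endpoint with your API key
# (see Authentication below); URL omitted here since it depends on your setup.
```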

## Authentication

- `Authorization` *header* **required**: Bearer authentication of the form `Bearer <token>`, where `<token>` is your API key. Get your API key from https://platform.respan.ai/platform/api-keys.
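A minimal sketch of building the authentication header, assuming the key is stored in a `RESPAN_API_KEY` environment variable (the variable name is our convention, not mandated by the API):

```python
import os

# Read the API key from the environment rather than hard-coding it.
API_KEY = os.environ.get("RESPAN_API_KEY", "sk-demo")

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```

Pass these headers with every Create span request.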


## Response

Successful response for Create span.

- `usage` *object*
## Errors

- `401` - Unauthorized Error
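A hedged sketch of handling the documented 401 error on the client side. `FakeResponse` stands in for whatever response object your HTTP client returns; the `check_span_response` helper is hypothetical:

```python
class FakeResponse:
    """Stand-in for an HTTP client response object."""
    def __init__(self, status_code, body):
        self.status_code = status_code
        self._body = body

    def json(self):
        return self._body

def check_span_response(response):
    """Raise on the documented 401; otherwise return the usage object."""
    if response.status_code == 401:
        raise PermissionError("Unauthorized: verify your Respan API key")
    return response.json().get("usage", {})

usage = check_span_response(FakeResponse(200, {"usage": {"total_tokens": 235}}))
```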