End-to-end traces for every LLM call, agent step, and conversation thread - so you can debug fast, cut costs, and catch regressions before users do.
100%
of requests captured - no sampling
<80ms
P99 ingestion latency
Real-time
search across all requests
2 lines
of code to instrument
Every team that ships LLM features without tracing eventually hits the same walls. Here's what breaks first.
✗ You can't see what your LLM is doing in production
Without structured logs, you have no record of what inputs were sent, what outputs came back, how long it took, or what it cost. Debugging is guesswork.
✗ Bad responses have no trail - just a complaint ticket
When a user reports a bad output, there's nothing to look at. No input, no context, no chain of calls. You can't reproduce, isolate, or fix what you can't see.
✗ You have no idea which features are driving LLM spend
Without per-request cost attribution, you can't tell if one user, one feature, or one model variant is responsible for your monthly bill spiking.
✗ Multi-step agent pipelines are a black box
Agents make dozens of sequential LLM calls. Without distributed tracing, you can't see where latency accumulates, where failures occur, or which step produced bad output.
✗ Regressions surface in user complaints, not dashboards
Without latency alerts and quality trend tracking, a model degradation or prompt regression goes unnoticed until users start churning.
Full tracing from the moment you add the SDK. No infrastructure to deploy, no custom schema to define.
The Respan SDK wraps your existing LLM client. Every call is intercepted, logged asynchronously, and available for search and analysis in under a second. For agent workloads, spans are linked via context propagation - the full execution tree is reconstructed automatically from the trace context attached to each call.
Instrument once
Wrap your OpenAI client with the Respan SDK or point your LangChain callback to Respan. Takes two lines.
Every call is logged
Inputs, outputs, latency, cost, tokens, and custom metadata are captured asynchronously - no impact on response time.
Spans are linked
For multi-step agents, span context is propagated automatically. Respan reconstructs the full trace tree.
Search and alert
Logs are searchable in real-time. Set dashboards for cost and latency trends, and configure threshold alerts.
import openai
from respan import Logger
logger = Logger(api_key="YOUR_RESPAN_KEY")
# Option 1: Wrap the OpenAI client directly
client = logger.wrap(openai.OpenAI(api_key="sk-..."))
# Now every call is logged automatically
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "..."}],
# Optionally tag this request
extra_body={
"respan_params": {
"customer_identifier": "user-123",
"metadata": {"feature": "document-qa"}
}
}
)
# Option 2: Auto-instrument LangChain
from respan.integrations.langchain import RespanCallbackHandler
callbacks = [RespanCallbackHandler(api_key="YOUR_RESPAN_KEY")]
# Pass callbacks= to any LangChain chain or agent
If a user reports a bad response: filter logs by user ID, find the exact request, and see the full input, output, and trace in seconds - not after a 30-minute log trawl.
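The filter-by-user workflow can be sketched in plain Python. The `LogRecord` shape and `find_requests` helper below are illustrative only - they mirror the `customer_identifier` tag from the SDK example above, not any real Respan query API.

```python
from dataclasses import dataclass, field

@dataclass
class LogRecord:
    # Hypothetical record shape, mirroring the respan_params tags
    # shown in the SDK example (customer_identifier, metadata)
    customer_identifier: str
    model: str
    input_text: str
    output_text: str
    latency_ms: float
    cost_usd: float
    metadata: dict = field(default_factory=dict)

def find_requests(logs, user_id):
    """Return every logged request for one user."""
    return [r for r in logs if r.customer_identifier == user_id]

logs = [
    LogRecord("user-123", "gpt-4o", "Summarize doc A", "Summary...",
              820.0, 0.012, {"feature": "document-qa"}),
    LogRecord("user-456", "gpt-4o", "Summarize doc B", "Summary...",
              640.0, 0.009, {"feature": "document-qa"}),
]

# Each hit carries the full input, output, latency, and cost for inspection
bad = find_requests(logs, "user-123")
```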
If you need to know why your LLM bill doubled: break down spend by user, feature, model, and environment. Find the two users driving 40% of your costs.
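A minimal sketch of that breakdown, assuming logged records carry the `customer_identifier` and `metadata` tags shown in the SDK example; `spend_by` is a hypothetical helper, not a Respan API.

```python
from collections import defaultdict

def spend_by(records, key):
    """Aggregate cost_usd by an arbitrary tag (user, feature, model, ...)."""
    totals = defaultdict(float)
    for r in records:
        totals[key(r)] += r["cost_usd"]
    return dict(totals)

records = [
    {"customer_identifier": "user-1", "metadata": {"feature": "document-qa"}, "cost_usd": 4.0},
    {"customer_identifier": "user-1", "metadata": {"feature": "chat"}, "cost_usd": 1.0},
    {"customer_identifier": "user-2", "metadata": {"feature": "document-qa"}, "cost_usd": 5.0},
]

by_user = spend_by(records, lambda r: r["customer_identifier"])
by_feature = spend_by(records, lambda r: r["metadata"]["feature"])
# by_user → {"user-1": 5.0, "user-2": 5.0}
```

The same one-liner works for any dimension you tagged at request time, which is why per-request attribution matters more than a single monthly total.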
If you're running multi-step agents: view the full span tree to see where latency accumulates, which tool calls fail, and what data each step passed to the next.
If you're switching from GPT-4o to a cheaper model: compare latency, cost, and output quality side-by-side using real production traffic from before and after the change.
If you have latency or error rate commitments: set up alerts on P95 latency and error rate. Get notified before users notice a degradation.
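A rough sketch of the threshold check such an alert performs, assuming a sliding window of logged latencies; the nearest-rank percentile and the 500 ms threshold are illustrative choices, not Respan defaults.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile; p in (0, 100]."""
    ranked = sorted(values)
    k = max(0, math.ceil(p / 100 * len(ranked)) - 1)
    return ranked[k]

def latency_alert(latencies_ms, p95_threshold_ms):
    """Return True when the window's P95 breaches the threshold."""
    return percentile(latencies_ms, 95) > p95_threshold_ms

window = [120, 140, 150, 160, 900]  # one slow outlier in a 5-request window
# P95 of this window is the 5th ranked value (900 ms), so the alert fires
alert = latency_alert(window, p95_threshold_ms=500)
```

Evaluating this continuously over a rolling window is what turns a silent degradation into a page before users notice.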
Model providers
Frameworks
Languages
General APM tools track latency and error rates, but they don't understand LLM semantics. They can't parse token counts, show cost-per-request, reconstruct an agent trace from span context, or let you search the content of a prompt. You'd have to build all of that as a custom layer on top. Respan ships it all, already wired up for LLM workloads, with zero custom schema work.