Respan gives you full visibility into your OpenAI API usage. Track token consumption, cost per request, latency distributions, and error rates across all your GPT-4o, GPT-4, and GPT-3.5 calls in one dashboard.
Route requests through the Respan gateway by changing your base URL, or log requests asynchronously via the request-logs endpoint. Both approaches capture token usage, cost, latency, and model information.
Configure credentials globally by adding your OpenAI API key in the Respan Providers dashboard, or pass them per-request with the customer_credentials parameter, which is useful for multi-tenant applications. Multiple API keys are supported, with configurable load-balancing weights for redundancy.
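As a rough sketch of the per-request approach, the credentials can travel alongside the request body. The exact shape of customer_credentials below (a provider name mapping to an api_key) is an assumption; check the Respan dashboard docs for the authoritative schema.

```python
# Sketch: per-request credentials for a multi-tenant app.
# The customer_credentials schema here is an ASSUMPTION, not confirmed API.

def build_extra_body(tenant_openai_key: str, customer_id: str) -> dict:
    """Build the extra_body payload carrying one tenant's own OpenAI key."""
    return {
        "customer_identifier": customer_id,
        "customer_credentials": {
            # Assumed shape: provider name -> credential fields
            "openai": {"api_key": tenant_openai_key},
        },
    }

extra_body = build_extra_body("sk-tenant-abc", "user-123")
# Pass it through the SDK:
# client.chat.completions.create(..., extra_body=extra_body)
```

Building the payload in a helper keeps tenant keys out of shared client configuration, so one client instance can serve every tenant.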
Route requests through the Respan gateway using the OpenAI SDK, LangChain, Vercel AI SDK, LlamaIndex, or any OpenAI-compatible framework. Every request is logged automatically.
For async logging without the gateway, send request data to the /request-logs/create/ endpoint to track performance metrics and costs for direct OpenAI API calls.
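A minimal sketch of the async-logging path might look like the following. The field names in the payload (model, prompt_tokens, completion_tokens, latency) are assumptions drawn from the metrics named above; consult the /request-logs/create/ reference for the authoritative schema.

```python
# Sketch: logging a direct OpenAI call to Respan after the fact.
# Payload field names are ASSUMPTIONS based on the metrics described above.
import json
import urllib.request

RESPAN_API_KEY = "YOUR_RESPAN_KEY"  # placeholder
LOG_URL = "https://api.keywordsai.co/api/request-logs/create/"

def build_log_payload(model: str, prompt_tokens: int,
                      completion_tokens: int, latency_s: float) -> dict:
    """Collect the usage metrics for one completed OpenAI request."""
    return {
        "model": model,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "latency": latency_s,
    }

def send_log(payload: dict) -> None:
    """POST the log entry; run in a background thread/task to avoid blocking."""
    req = urllib.request.Request(
        LOG_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {RESPAN_API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    urllib.request.urlopen(req)

payload = build_log_payload("gpt-4o", 12, 34, 0.8)
# send_log(payload)  # fire-and-forget after your OpenAI call returns
```

Because logging happens out of band, it adds no latency to the user-facing request path.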
```python
import openai

# Gateway approach: change one line (the base URL) to route through Respan
client = openai.OpenAI(
    api_key="YOUR_RESPAN_KEY",
    base_url="https://api.keywordsai.co/api/",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={
        "customer_identifier": "user-123",  # attribute usage to an end user
        "fallback_models": ["claude-3-5-sonnet-20241022"],  # failover target
    },
)
```