Respan gives you full visibility into your OpenAI API usage. Track token consumption, cost per request, latency distributions, and error rates across all your GPT-4o, GPT-4, and GPT-3.5 calls in one dashboard.
Route requests through the Respan gateway by changing your base URL, or log requests asynchronously via the request-logs endpoint. Both approaches capture token usage, cost, latency, and model information.
Configure credentials globally by adding your OpenAI API key in the Respan Providers dashboard, or pass them per-request with the customer_credentials parameter, which is useful for multi-tenant applications. Multiple API keys are supported, with configurable load-balancing weights for redundancy.
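As a rough sketch of the per-request approach, the credentials can travel alongside the request body. The exact shape of customer_credentials below (a provider name mapping to an api_key) is an assumption; check the Respan dashboard docs for the authoritative schema.

```python
# Sketch: per-request credentials for a multi-tenant app.
# The customer_credentials schema here is an ASSUMPTION, not confirmed API.

def build_extra_body(tenant_openai_key: str, customer_id: str) -> dict:
    """Build the extra_body payload carrying one tenant's own OpenAI key."""
    return {
        "customer_identifier": customer_id,
        "customer_credentials": {
            # Assumed shape: provider name -> credential fields
            "openai": {"api_key": tenant_openai_key},
        },
    }

extra_body = build_extra_body("sk-tenant-abc", "user-123")
# Pass it through the SDK:
# client.chat.completions.create(..., extra_body=extra_body)
```

Building the payload in a helper keeps tenant keys out of shared client configuration, so one client instance can serve every tenant.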
Route requests through the Respan gateway using the OpenAI SDK, LangChain, Vercel AI SDK, LlamaIndex, or any OpenAI-compatible framework. Every request is logged automatically.
For async logging without the gateway, send request data to the /request-logs/create/ endpoint to track performance metrics and costs for direct OpenAI API calls.
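A minimal sketch of the async-logging path might look like the following. The field names in the payload (model, prompt_tokens, completion_tokens, latency) are assumptions drawn from the metrics named above; consult the /request-logs/create/ reference for the authoritative schema.

```python
# Sketch: logging a direct OpenAI call to Respan after the fact.
# Payload field names are ASSUMPTIONS based on the metrics described above.
import json
import urllib.request

RESPAN_API_KEY = "YOUR_RESPAN_KEY"  # placeholder
LOG_URL = "https://api.keywordsai.co/api/request-logs/create/"

def build_log_payload(model: str, prompt_tokens: int,
                      completion_tokens: int, latency_s: float) -> dict:
    """Collect the usage metrics for one completed OpenAI request."""
    return {
        "model": model,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "latency": latency_s,
    }

def send_log(payload: dict) -> None:
    """POST the log entry; run in a background thread/task to avoid blocking."""
    req = urllib.request.Request(
        LOG_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {RESPAN_API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    urllib.request.urlopen(req)

payload = build_log_payload("gpt-4o", 12, 34, 0.8)
# send_log(payload)  # fire-and-forget after your OpenAI call returns
```

Because logging happens out of band, it adds no latency to the user-facing request path.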
```python
import openai

# Gateway approach: change one line (the base URL) to route through Respan
client = openai.OpenAI(
    api_key="YOUR_RESPAN_KEY",
    base_url="https://api.keywordsai.co/api/",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={
        "customer_identifier": "user-123",  # attribute usage to an end user
        "fallback_models": ["claude-3-5-sonnet-20241022"],  # failover target
    },
)
```