End-to-end: from first API call to production monitoring
Set up Respan
- Sign up — Create an account at platform.respan.ai
- Create an API key — Generate one on the API keys page
- Add credits or a provider key — Add credits on the Credits page or connect your own provider key on the Integrations page
Overview
This cookbook walks through the complete Respan workflow with a real example: a customer support chatbot. By the end, you’ll have:
- LLM calls routed through the gateway
- Prompts managed in the platform
- Full tracing for every conversation
- Automated evaluation on production traffic
Step 1: Set up the gateway
Point your OpenAI SDK to Respan:
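A minimal sketch, assuming the gateway exposes an OpenAI-compatible endpoint at `https://api.respan.ai/v1` (use the base URL shown on your API keys page) and that your key is stored in the `RESPAN_API_KEY` environment variable:

```python
import os

from openai import OpenAI

# Point the standard OpenAI SDK at the Respan gateway.
# The base URL below is an assumption; use the one shown in your Respan dashboard.
client = OpenAI(
    base_url="https://api.respan.ai/v1",
    api_key=os.environ["RESPAN_API_KEY"],
)
```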
Test that it works:
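For example, send a single chat completion through the gateway and print the reply:

```python
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, can you hear me?"}],
)
print(response.choices[0].message.content)
```

If the call succeeds, the request should also appear in your Respan logs.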
Step 2: Create a prompt template
Go to Prompts and create a prompt:
- Name: `support_chatbot`
- System message: the instructions for your support assistant (an illustrative example follows below)
- Model: `gpt-4o-mini`
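The exact system message is up to you; for a support use case it might look something like this (illustrative only):

```
You are a helpful customer support assistant. Answer questions about the
product accurately and concisely. If you don't know the answer, say so and
offer to escalate to a human agent.
```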
Step 3: Use the prompt in code
Fetch the prompt at runtime so you can update it without redeploying:
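The sketch below assumes a hypothetical `GET /prompts/{name}` endpoint that returns the active version's `model` and `messages`; the real path and response shape are in the Prompts API reference. It reuses the `client` from Step 1.

```python
import os

import requests

RESPAN_BASE_URL = "https://api.respan.ai/v1"  # assumption, see Step 1

def get_prompt(name: str) -> dict:
    # Hypothetical endpoint; check the Prompts API reference for the real path.
    resp = requests.get(
        f"{RESPAN_BASE_URL}/prompts/{name}",
        headers={"Authorization": f"Bearer {os.environ['RESPAN_API_KEY']}"},
    )
    resp.raise_for_status()
    return resp.json()  # assumed to contain "model" and "messages"

def answer_question(question: str) -> str:
    # Fetch the active prompt version at runtime, then call the gateway.
    prompt = get_prompt("support_chatbot")
    response = client.chat.completions.create(
        model=prompt["model"],  # gpt-4o-mini, per the template
        messages=prompt["messages"] + [{"role": "user", "content": question}],
    )
    return response.choices[0].message.content
```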
Step 4: Add tracing
Wrap the chatbot logic with tracing to see the full execution flow:
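A sketch of the shape this usually takes, assuming the Respan tracing SDK exposes decorators along these lines (the import path and decorator names are assumptions; substitute the ones from the SDK docs):

```python
from respan import task, workflow  # hypothetical import path and names

@task(name="fetch_prompt")
def get_prompt(name: str) -> dict:
    ...  # same as in Step 3

@task(name="llm_call")
def generate_answer(prompt: dict, question: str) -> str:
    response = client.chat.completions.create(
        model=prompt["model"],
        messages=prompt["messages"] + [{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

@workflow(name="support_chat")
def handle_support_question(question: str) -> str:
    # The workflow span wraps both tasks, so the trace shows the full flow.
    prompt = get_prompt("support_chatbot")
    return generate_answer(prompt, question)
```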
You’ll see this trace in Respan:
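Roughly this shape, with span names matching the decorators above (the exact layout depends on the Respan UI):

```
support_chat
├── fetch_prompt
└── llm_call   (gpt-4o-mini: tokens, latency, cost)
```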
Step 5: Create an evaluator
Go to Evaluation > Evaluators > + New evaluator:
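The evaluator used in the rest of this cookbook looks roughly like this (the LLM-as-a-judge type and prompt wording are assumptions; configure whatever fits your use case):

- Name: Support Quality
- Type: LLM-as-a-judge
- Scoring: 1–5, where 5 means the answer fully and accurately resolves the customer's question
- Evaluation prompt: ask the judge to score the response for helpfulness, accuracy, and tone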
Step 6: Run an offline experiment
Before going to production, test your prompt against sample questions:
- Go to Experiments > + New experiment
- Select the `support_chatbot` prompt
- Add test cases (see the examples after this list)
- Run the experiment and evaluate with the “Support Quality” evaluator
- If scores are good, proceed to production
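A handful of illustrative test cases for a support chatbot; in practice, use real questions from your own product:

```
How do I reset my password?
Can I get a refund after 30 days?
Why was my card charged twice this month?
My API key stopped working. What should I check?
```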
Step 7: Set up online evaluation
Create an automation to continuously evaluate production traffic:
- Condition: `metadata.feature = "support_chat"`
- Evaluator: Support Quality
- Sampling rate: `0.2` (evaluate 20% of traffic)
Now roughly one in five support conversations is automatically scored.
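For the condition above to match, each request needs that metadata attached. How metadata is passed depends on the gateway; one common pattern with the OpenAI SDK is to send it in the request body via `extra_body` (the `metadata` field name and this mechanism are assumptions, so confirm them against the gateway docs):

```python
# Inside generate_answer() from Step 4, attach metadata to each gateway call.
response = client.chat.completions.create(
    model=prompt["model"],
    messages=prompt["messages"] + [{"role": "user", "content": question}],
    # The "metadata" field name and extra_body mechanism are assumptions;
    # check how the Respan gateway expects request metadata to be attached.
    extra_body={"metadata": {"feature": "support_chat"}},
)
```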
Step 8: Monitor and iterate
Your production monitoring is now running. Here’s your ongoing workflow:
- Check the dashboard daily — Watch cost, latency, and evaluation scores
- Review low-scoring logs — Filter for conversations that scored below 3
- Add failures to your test dataset — Growing your dataset makes evaluations more robust
- Iterate on the prompt — Edit in the Respan playground, test with experiments
- Deploy updates — Update the active prompt version, monitor the impact
The complete workflow is now a loop: production data feeds evaluation, which feeds prompt improvements, which deploy back to production.