Prerequisites

  1. Sign up — Create an account at platform.respan.ai
  2. Create an API key — Generate one on the API keys page
  3. Add credits or a provider key — Add credits on the Credits page or connect your own provider key on the Integrations page

What is BAML?

BAML is a domain-specific language for building reliable LLM functions with structured outputs. This guide shows how to set up Respan with BAML for both tracing and gateway routing.
Import the instrumentation module before any other module imports baml_client.b to ensure all functions are decorated.

Setup

1. Set environment variables

.env
RESPAN_API_KEY=your-respan-api-key
RESPAN_BASE_URL=https://api.respan.ai
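These variables can be read at startup with the standard library; a minimal sketch (the fallback URL mirrors the value above):

```python
import os

# Read the Respan credentials set in .env (variable names from the step above)
RESPAN_API_KEY = os.environ.get("RESPAN_API_KEY", "")
RESPAN_BASE_URL = os.environ.get("RESPAN_BASE_URL", "https://api.respan.ai")
```

If you keep the values in a .env file rather than exporting them in the shell, a loader such as python-dotenv can populate os.environ before this code runs.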
2. Create the instrumentation module

Create a module (e.g. respan_baml_tracing.py) that decorates all callable BAML client functions. Import this module first in your app so every BAML call is traced.
respan_baml_tracing.py
import logging
from collections.abc import Callable
from functools import wraps

from baml_client import b
from baml_py import Collector
from opentelemetry.semconv_ai import LLMRequestTypeValues, SpanAttributes
from respan_tracing.decorators import task

from servicelib.globals import kai_client  # your application's Respan tracing client instance

logger = logging.getLogger(__name__)

@task(name="baml_usage")
async def update_span_with_baml_usage(baml_function: Callable, *args, **kwargs):
    baml_collector = Collector(name="baml_usage_collector")
    kwargs.setdefault("baml_options", {})
    kwargs["baml_options"]["collector"] = baml_collector

    result = await baml_function(*args, **kwargs)

    last = baml_collector.last
    input_tokens = (last.usage.input_tokens or 0) if last else 0
    output_tokens = (last.usage.output_tokens or 0) if last else 0

    baml_model_name = last.selected_call.client_name if last and last.selected_call else None

    attributes = {
        SpanAttributes.LLM_USAGE_PROMPT_TOKENS: input_tokens,
        SpanAttributes.LLM_USAGE_COMPLETION_TOKENS: output_tokens,
        SpanAttributes.LLM_USAGE_TOTAL_TOKENS: input_tokens + output_tokens,
        SpanAttributes.TRACELOOP_SPAN_KIND: LLMRequestTypeValues.CHAT.value,
    }
    if baml_model_name:
        attributes[SpanAttributes.LLM_REQUEST_MODEL] = baml_model_name
        attributes[SpanAttributes.LLM_RESPONSE_MODEL] = baml_model_name

    kai_client.update_current_span(
        respan_params={"metadata": {"example": "baml"}},
        attributes=attributes,
        name="baml_usage.chat",
    )
    return result


def respan_baml_trace(func: Callable):
    @wraps(func)
    async def wrapper(*args, **kwargs):
        return await update_span_with_baml_usage(func, *args, **kwargs)
    return wrapper


def initialize_respan_baml_tracing() -> tuple[int, int]:
    """Apply respan_baml_trace decorator to all callable BAML functions."""
    decorated_count = 0
    found_count = 0
    logger.info("[BAML Tracing] Decorating BAML client with tracing...")
    for attr_name in dir(b):
        if not attr_name.startswith("_"):
            attr = getattr(b, attr_name)
            if callable(attr):
                found_count += 1
                setattr(b, attr_name, respan_baml_trace(attr))
                decorated_count += 1
    logger.info(f"[BAML Tracing] Decorated {decorated_count} / {found_count} BAML functions")
    return decorated_count, found_count

# Initialize immediately when this module is imported
_decorated_count, _found_count = initialize_respan_baml_tracing()

# Export the decorated BAML client for other modules to use
__all__ = ["b"]
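The decoration loop in initialize_respan_baml_tracing only depends on dir/getattr/setattr, so it can be exercised in isolation. A minimal sketch with a stand-in client (FakeClient, Greet, and the simplified trace wrapper are illustrative, not part of BAML or the Respan SDK):

```python
import asyncio
from collections.abc import Callable
from functools import wraps

class FakeClient:
    """Stand-in for the generated BAML client `b` (illustrative only)."""
    async def Greet(self, name: str) -> str:
        return f"hello {name}"

def trace(func: Callable):
    # Simplified stand-in for respan_baml_trace: a real wrapper would also
    # attach a Collector via baml_options and report usage, as in the module above.
    @wraps(func)
    async def wrapper(*args, **kwargs):
        return await func(*args, **kwargs)
    return wrapper

b = FakeClient()
found = 0
for attr_name in dir(b):
    if not attr_name.startswith("_"):
        attr = getattr(b, attr_name)
        if callable(attr):
            setattr(b, attr_name, trace(attr))
            found += 1

result = asyncio.run(b.Greet("world"))
```

Because the wrapper uses functools.wraps, the decorated functions keep their original names, which keeps span names readable in the dashboard.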
3. Use the decorated BAML client

Import the decorated b from your instrumentation module and use it like your normal BAML client:
app.py
import asyncio

from respan_baml_tracing import b  # importing this module applies the tracing decorators

async def main():
    result = await b.some_function(arg1="value")
    print(result)

if __name__ == "__main__":
    asyncio.run(main())
4. View your trace

Open the Traces page in the Respan dashboard.

Attributes

Attach Respan-specific parameters to your traces via respan_params in the tracing decorator, or via the X-Data-Respan-Params header when using the gateway.

Via tracing SDK

kai_client.update_current_span(
    respan_params={
        "customer_identifier": "user_123",
        "metadata": {"session_id": "abc123"},
    },
)

Via gateway header

import base64, json

respan_params = {
    "customer_identifier": "user_123",
    "thread_id": "conversation_456",
    "metadata": {"session_id": "abc123"},
}
encoded = base64.b64encode(json.dumps(respan_params).encode("utf-8")).decode("utf-8")

# Pass as header: {"X-Data-Respan-Params": encoded}
Attribute             Description
customer_identifier   Customer or user identifier
thread_id             Thread or conversation identifier
metadata              Custom key-value pairs attached to the trace
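Since the header value is just base64-encoded JSON, the gateway can recover the original parameters by reversing the two steps. A quick round-trip check of the encoding shown above:

```python
import base64
import json

respan_params = {
    "customer_identifier": "user_123",
    "metadata": {"session_id": "abc123"},
}

# Encode exactly as in the gateway-header example above
encoded = base64.b64encode(json.dumps(respan_params).encode("utf-8")).decode("utf-8")

# Decoding reverses both steps and recovers the original parameters
decoded = json.loads(base64.b64decode(encoded).decode("utf-8"))
```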

Gateway

Route BAML LLM calls through the Respan gateway for automatic logging and observability.

Using a BAML client registry

Set up a client registry that points at the Respan gateway and pass Respan parameters via headers:
from baml_client.sync_client import b
from baml_client.types import Resume
from baml_py import ClientRegistry
import base64
import json
import os

def extract_resume_with_respan(raw_resume: str) -> Resume:
    cr = ClientRegistry()

    respan_params = {
        "customer_identifier": "123",
    }
    respan_params_encoded = base64.b64encode(
        json.dumps(respan_params).encode("utf-8")
    ).decode("utf-8")

    cr.add_llm_client(
        name="RespanClient",
        provider="openai",
        options={
            "model": "gpt-4o",
            "api_key": os.getenv("RESPAN_API_KEY"),
            "base_url": "https://api.respan.ai/api",
            "headers": {"X-Data-Respan-Params": respan_params_encoded},
        },
    )
    cr.set_primary("RespanClient")

    response = b.ExtractResume(
        raw_resume,
        baml_options={"client_registry": cr},
    )
    return response

Using a BAML config file

Define a Respan client directly in your .baml config:
client<llm> Respan {
  provider openai
  options {
    model gpt-4o
    api_key env.RESPAN_API_KEY
    base_url "https://api.respan.ai/api"
  }
}
Then reference client Respan in your BAML function definitions.
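A function definition using this client might look like the following sketch (ExtractResume and Resume are reused from the client-registry example above; the prompt body is illustrative):

```baml
function ExtractResume(raw_resume: string) -> Resume {
  client Respan
  prompt #"
    Extract a structured resume from the following text:
    {{ raw_resume }}
  "#
}
```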

Observability

With this integration, Respan auto-captures:
  • BAML function calls — each decorated function as a span
  • LLM calls — model, token usage (input/output/total)
  • Gateway routing — requests routed through Respan gateway
  • Errors — failed calls and error details
View traces on the Traces page.