  1. Sign up — Create an account at platform.respan.ai
  2. Create an API key — Generate one on the API keys page
  3. Add credits or a provider key — Add credits on the Credits page or connect your own provider key on the Integrations page
Add the Docs MCP to your AI coding tool to get help building with Respan. No API key needed.
{
  "mcpServers": {
    "respan-docs": {
      "url": "https://docs.respan.ai/mcp"
    }
  }
}

What is LlamaIndex?

LlamaIndex is a data framework for building LLM applications with data connectors, indices, and query engines. Respan captures every query, retrieval step, and LLM call as spans in a trace.

Setup

1. Install packages
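The SDK package name below is an assumption based on the imports used in this guide; the OpenInference LlamaIndex plugin and LlamaIndex itself are on PyPI:

pip install respan openinference-instrumentation-llama-index llama-index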

2. Set environment variables

export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"
export OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
3. Initialize and run
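A minimal script that puts the pieces together, using the same initialization and query pattern as the examples later on this page:

from respan import Respan
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from llama_index.core import VectorStoreIndex, Document

# Activate the LlamaIndex instrumentation so every query, retrieval
# step, and LLM call is captured as spans in a trace
respan = Respan(instrumentations=[LlamaIndexInstrumentor()])

index = VectorStoreIndex.from_documents(
    [Document(text="Respan provides LLM observability.")]
)
response = index.as_query_engine().query("What does Respan do?")
print(response)

# Flush pending spans before the process exits
respan.flush()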

4. View your trace

Open the Traces page to see your query execution with retrieval and LLM spans.

Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str \| None | None | Falls back to RESPAN_API_KEY env var. |
| base_url | str \| None | None | Falls back to RESPAN_BASE_URL env var. |
| instrumentations | list | [] | Plugin instrumentations to activate (e.g. LlamaIndexInstrumentor()). |
| is_auto_instrument | bool \| None | False | Auto-discover and activate all installed instrumentors via OpenTelemetry entry points. |
| customer_identifier | str \| None | None | Default customer identifier for all spans. |
| metadata | dict \| None | None | Default metadata attached to all spans. |
| environment | str \| None | None | Environment tag (e.g. "production"). |
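If you'd rather not list plugins explicitly, is_auto_instrument discovers installed instrumentors for you; a minimal sketch based on the parameters above:

from respan import Respan

# Activate every instrumentor installed in the environment,
# discovered via OpenTelemetry entry points
respan = Respan(is_auto_instrument=True)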

Attributes

In Respan()

Set defaults at initialization — these apply to all spans.
from respan import Respan
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor

respan = Respan(
    instrumentations=[LlamaIndexInstrumentor()],
    customer_identifier="user_123",
    metadata={"service": "rag-api", "version": "1.0.0"},
)

With propagate_attributes

Override per-request using a context manager.
from respan import Respan, workflow, propagate_attributes
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor

respan = Respan(instrumentations=[LlamaIndexInstrumentor()])

@workflow(name="handle_query")
def handle_query(user_id: str, question: str):
    with propagate_attributes(
        customer_identifier=user_id,
        thread_identifier="conv_001",
        metadata={"plan": "pro"},
    ):
        # query_engine is assumed to be built elsewhere, e.g. index.as_query_engine()
        response = query_engine.query(question)
        print(response)

| Attribute | Type | Description |
| --- | --- | --- |
| customer_identifier | str | Identifies the end user in Respan analytics. |
| thread_identifier | str | Groups related messages into a conversation. |
| metadata | dict | Custom key-value pairs. Merged with default metadata. |
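Since per-request metadata is merged rather than replaced, defaults and overrides combine on each span; a small sketch illustrating the two sources:

from respan import Respan, propagate_attributes
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor

# Default metadata attaches to every span
respan = Respan(
    instrumentations=[LlamaIndexInstrumentor()],
    metadata={"service": "rag-api"},
)

# Spans inside this block carry both the "service" and "plan" keys
with propagate_attributes(metadata={"plan": "pro"}):
    ...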

Decorators

Use @workflow and @task to create structured trace hierarchies.
from respan import Respan, workflow, task
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from llama_index.core import VectorStoreIndex, Document

respan = Respan(instrumentations=[LlamaIndexInstrumentor()])

@task(name="build_index")
def build_index(texts: list[str]):
    documents = [Document(text=t) for t in texts]
    return VectorStoreIndex.from_documents(documents)

@task(name="query_index")
def query_index(index, question: str) -> str:
    engine = index.as_query_engine()
    return str(engine.query(question))

@workflow(name="rag_pipeline")
def pipeline(texts: list[str], question: str):
    index = build_index(texts)
    answer = query_index(index, question)
    print(answer)

pipeline(
    texts=["Respan provides LLM observability.", "LlamaIndex enables RAG."],
    question="What does Respan do?",
)
respan.flush()

Examples

Basic query

# Assumes Respan was already initialized as in Setup, so this
# query is captured automatically as a trace
from llama_index.core import VectorStoreIndex, Document

documents = [
    Document(text="The capital of France is Paris."),
    Document(text="The capital of Germany is Berlin."),
]
index = VectorStoreIndex.from_documents(documents)
engine = index.as_query_engine()

response = engine.query("What is the capital of France?")
print(response)

Index and query

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load documents from a directory
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query with similarity search
engine = index.as_query_engine(similarity_top_k=3)
response = engine.query("Summarize the key points in these documents.")
print(response)

Gateway

You can route LLM calls through the Respan gateway by configuring the OpenAI LLM:
import os

from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4.1-nano",
    api_key=os.getenv("RESPAN_API_KEY"),
    api_base="https://api.respan.ai/api",
)
Use the gateway-configured LLM when building your index or query engine.
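For example, it can be set as the global default or scoped to a single engine; a sketch using LlamaIndex's standard hooks, reusing the llm object from the block above:

from llama_index.core import Settings, VectorStoreIndex, Document

# Option 1: make the gateway-backed LLM the default for all of LlamaIndex
Settings.llm = llm

# Option 2: scope it to one query engine
index = VectorStoreIndex.from_documents(
    [Document(text="Respan provides LLM observability.")]
)
engine = index.as_query_engine(llm=llm)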