Ollama

Ollama lets you run large language models locally. It provides a simple CLI and API for downloading, running, and managing models like Llama, Mistral, and Gemma on your own hardware. Respan gives you full observability over every Ollama call, streamed response, and tool invocation — and gateway routing through the OpenAI-compatible Respan endpoint when you want to mix local and hosted models.
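To route through the gateway, you point an OpenAI-compatible client at the Respan endpoint instead of calling Ollama directly. A minimal sketch using the openai package, assuming a gateway base URL of https://api.respan.ai/v1 and an Ollama-backed model name of "ollama/llama3.1" (both are placeholders; check your Respan dashboard for the exact values):

import os
from openai import OpenAI

# Placeholder base URL and model name; confirm both in your Respan dashboard.
client = OpenAI(
    base_url="https://api.respan.ai/v1",
    api_key=os.environ["RESPAN_API_KEY"],
)

response = client.chat.completions.create(
    model="ollama/llama3.1",
    messages=[{"role": "user", "content": "Say hello in three languages."}],
)
print(response.choices[0].message.content)

The rest of this page covers the local, auto-instrumented path.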

Create an account at platform.respan.ai and grab an API key. To use the gateway, also add credits or a provider key.

Run npx @respan/cli setup to configure Respan with your coding agent.

Setup

1. Install packages

$ pip install respan-ai opentelemetry-instrumentation-ollama ollama
2. Set environment variables

$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"

RESPAN_API_KEY is used to export traces to Respan. Ollama runs locally and needs no API key.
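If you would rather not depend on the environment variable, the Respan constructor also accepts the key directly (see the Configuration table below). A minimal sketch:

import os
from respan import Respan
from opentelemetry.instrumentation.ollama import OllamaInstrumentor

# Passing api_key explicitly; if omitted, it falls back to the RESPAN_API_KEY env var.
respan = Respan(
    api_key=os.environ.get("RESPAN_API_KEY", "YOUR_RESPAN_API_KEY"),
    instrumentations=[OllamaInstrumentor()],
)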

3. Initialize and run

from ollama import Client
from respan import Respan
from opentelemetry.instrumentation.ollama import OllamaInstrumentor

respan = Respan(instrumentations=[OllamaInstrumentor()])

client = Client()

response = client.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "Say hello in three languages."}],
)
print(response["message"]["content"])
respan.flush()
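Streaming works the same way: pass stream=True to the Ollama client and iterate over the chunks; the streamed exchange is still captured. A minimal sketch reusing the client and model from the step above:

# Streamed chat: each chunk carries a partial assistant message.
stream = client.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": "Write a haiku about local models."}],
    stream=True,
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
print()
respan.flush()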
4. View your trace

Open the Traces page to see your auto-instrumented Ollama spans with prompts, tokens, and latency.

Configuration

Parameter            Type         Default  Description
api_key              str | None   None     Falls back to RESPAN_API_KEY env var.
base_url             str | None   None     Falls back to RESPAN_BASE_URL env var.
instrumentations     list         []       Plugin instrumentations to activate (e.g. OllamaInstrumentor()).
customer_identifier  str | None   None     Default customer identifier for all spans.
metadata             dict | None  None     Default metadata attached to all spans.
environment          str | None   None     Environment tag (e.g. "production").
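environment is the one parameter not shown in the examples below; a minimal sketch that sets it alongside default metadata at initialization (placeholder values):

from respan import Respan
from opentelemetry.instrumentation.ollama import OllamaInstrumentor

# environment tags every exported span; metadata becomes the default merged into all spans.
respan = Respan(
    instrumentations=[OllamaInstrumentor()],
    environment="production",
    metadata={"service": "local-llm", "version": "1.0.0"},
)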

Attributes

In Respan()

Set defaults at initialization — these apply to all spans.

from respan import Respan
from opentelemetry.instrumentation.ollama import OllamaInstrumentor

respan = Respan(
    instrumentations=[OllamaInstrumentor()],
    customer_identifier="user_123",
    metadata={"service": "local-llm", "version": "1.0.0"},
)

With propagate_attributes

Override per-request using a context scope.

from ollama import Client
from respan import Respan, propagate_attributes
from opentelemetry.instrumentation.ollama import OllamaInstrumentor

respan = Respan(instrumentations=[OllamaInstrumentor()])
client = Client()

def handle_request(user_id: str, question: str):
    with propagate_attributes(
        customer_identifier=user_id,
        thread_identifier="conv_abc_123",
        metadata={"plan": "pro"},
    ):
        response = client.chat(
            model="llama3.1",
            messages=[{"role": "user", "content": question}],
        )
        print(response["message"]["content"])

Attribute            Type   Description
customer_identifier  str    Identifies the end user in Respan analytics.
thread_identifier    str    Groups related messages into a conversation.
metadata             dict   Custom key-value pairs. Merged with default metadata.