Ollama
Ollama lets you run large language models locally. It provides a simple CLI and API for downloading, running, and managing models like Llama, Mistral, and Gemma on your own hardware. Respan gives you full observability over every Ollama call, streamed response, and tool invocation, and can also route requests through the OpenAI-compatible Respan gateway endpoint when you want to mix local and hosted models.
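As a point of reference, a plain chat call with the official ollama Python library looks like the sketch below; the llama3.2 model name is just an example and assumes you have already pulled it (ollama pull llama3.2). Once Respan's instrumentation is initialized (see Setup below), calls like these, including streamed ones, are captured automatically.

```python
# Minimal local chat call with the official `ollama` Python library.
# Assumes the Ollama server is running and `llama3.2` has been pulled.
import ollama

response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])

# Streamed responses are observed span-by-span; iterate over chunks:
for chunk in ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Tell me a short joke."}],
    stream=True,
):
    print(chunk["message"]["content"], end="", flush=True)
```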
Set up Respan
Create an account at platform.respan.ai and grab an API key. To use the gateway, also add credits or a provider key.
Run npx @respan/cli setup to configure Respan with your coding agent.
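With credits or a provider key in place, hosted models can be reached through the gateway's OpenAI-compatible API. The sketch below uses the official openai Python client; the base URL shown is an assumption, so check your Respan dashboard for the actual gateway endpoint.

```python
# Sketch: routing a request through the Respan gateway using the
# OpenAI-compatible API. The base URL is an assumption; check the
# dashboard for the real endpoint.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.respan.ai/v1",  # hypothetical gateway URL
    api_key=os.environ["RESPAN_API_KEY"],
)

completion = client.chat.completions.create(
    model="gpt-4o-mini",  # a hosted model, mixed in alongside local Ollama models
    messages=[{"role": "user", "content": "Hello from the gateway!"}],
)
print(completion.choices[0].message.content)
```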
Example projects
- Tracing
- Gateway
Setup
Set environment variables
Set RESPAN_API_KEY so your traces are exported to Respan. Ollama runs locally and needs no API key of its own.
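A minimal sketch, assuming you set the variable from Python rather than in your shell:

```python
# The only variable the tracing setup needs is RESPAN_API_KEY.
import os

os.environ["RESPAN_API_KEY"] = "your-api-key"  # or export it in your shell

# No OLLAMA_* key is required: the Ollama server listens locally
# (http://localhost:11434 by default).
```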
View your trace
Open the Traces page to see your auto-instrumented Ollama spans with prompts, tokens, and latency.
Configuration
Attributes
In Respan()
Set defaults at initialization; these apply to all spans.
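A sketch of what that might look like; the import path and the attributes parameter name are assumptions here, so consult the Respan SDK reference for the exact signature.

```python
# Sketch: default attributes set once at initialization apply to every
# span the SDK emits. Import path and parameter names are assumptions.
from respan import Respan  # hypothetical import path

Respan(
    attributes={
        "environment": "development",  # example default attributes
        "service": "ollama-demo",
    },
)
```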
With propagate_attributes
Override attributes per request using a context scope.
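A sketch under the same assumptions; propagate_attributes is the name used in this doc, but the import path and exact call shape should be checked against the SDK reference.

```python
# Sketch: spans created inside the scope carry the overridden attributes,
# while spans outside it keep the defaults from Respan().
from respan import propagate_attributes  # hypothetical import path
import ollama

with propagate_attributes({"customer_id": "acme-123"}):
    ollama.chat(
        model="llama3.2",
        messages=[{"role": "user", "content": "Summarize our last chat."}],
    )
```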