Replicate | Respan Docs

Replicate is a platform for running machine learning models in the cloud. It hosts thousands of open-source models and provides a simple API for running predictions without managing infrastructure. Respan adds observability for every prediction, including its inputs and outputs, and supports gateway routing through the OpenAI-compatible Respan endpoint.

Set up Respan

Create an account at platform.respan.ai and grab an API key. To use the gateway, also add credits or a provider key.

Run npx @respan/cli setup to set up with your coding agent.

Tracing

Setup

Install packages

$ pip install respan-ai opentelemetry-instrumentation-replicate replicate

Set environment variables

$ export REPLICATE_API_TOKEN="YOUR_REPLICATE_API_TOKEN"
$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"

REPLICATE_API_TOKEN is used for Replicate predictions. RESPAN_API_KEY is used to export traces to Respan.
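
Before initializing, it can help to fail fast when either variable is missing. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and is not part of Respan or Replicate.

```python
import os

# The two variables named in this guide.
REQUIRED_VARS = ("REPLICATE_API_TOKEN", "RESPAN_API_KEY")

def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.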

Initialize and run

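import replicate
from respan import Respan
from opentelemetry.instrumentation.replicate import ReplicateInstrumentor

# Initialize Respan with the Replicate instrumentor so predictions are traced.
respan = Respan(instrumentations=[ReplicateInstrumentor()])

# Run a prediction; the instrumentor records it automatically.
output = replicate.run(
    "meta/llama-2-70b-chat",
    input={"prompt": "Say hello in three languages."},
)
print("".join(output))  # join the returned text chunks into one string
respan.flush()  # flush pending traces before the script exits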
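
In the script above, the final `respan.flush()` call is skipped if the prediction raises. One way to guard against that is a `try`/`finally` wrapper. The sketch below is not Respan API: `run_then_flush` is a hypothetical helper that accepts any two callables, and the demo uses stand-ins so it runs without network access. In a real script you would pass a function that calls `replicate.run(...)` and pass `respan.flush`.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

def run_then_flush(run: Callable[[], T], flush: Callable[[], None]) -> T:
    """Call run(), then call flush() afterwards even if run() raises."""
    try:
        return run()
    finally:
        flush()

# Stand-ins for replicate.run(...) and respan.flush (assumptions for the demo).
events = []
result = run_then_flush(
    run=lambda: "".join(["Hello", ", ", "Hola"]),
    flush=lambda: events.append("flushed"),
)
print(result, events)
```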

View your trace

Open the Traces page to see your auto-instrumented prediction spans with model, inputs, outputs, and latency.

Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `api_key` | `str \| None` | `None` | Falls back to `RESPAN_API_KEY` env var. |
| `base_url` | `str \| None` | `None` | Falls back to `RESPAN_BASE_URL` env var. |
| `instrumentations` | `list` | `[]` | Plugin instrumentations to activate (e.g. `ReplicateInstrumentor()`). |
| `customer_identifier` | `str \| None` | `None` | Default customer identifier for all spans. |
| `metadata` | `dict \| None` | `None` | Default metadata attached to all spans. |
| `environment` | `str \| None` | `None` | Environment tag (e.g. `"production"`). |
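
The table says `api_key` and `base_url` fall back to environment variables when left as `None`. The sketch below illustrates that precedence with a hypothetical `resolve_setting` helper. It reflects an assumption about the behavior the table describes and is not Respan's actual implementation.

```python
import os
from typing import Mapping, Optional

def resolve_setting(
    explicit: Optional[str],
    env_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Prefer the explicit argument; otherwise read env_name from the environment."""
    env = os.environ if env is None else env
    return explicit if explicit is not None else env.get(env_name)

# A fake environment keeps the demo independent of the real one.
fake_env = {"RESPAN_API_KEY": "key-from-env"}
print(resolve_setting(None, "RESPAN_API_KEY", fake_env))            # key-from-env
print(resolve_setting("key-from-arg", "RESPAN_API_KEY", fake_env))  # key-from-arg
print(resolve_setting(None, "RESPAN_BASE_URL", fake_env))           # None
```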

Attributes

In Respan()

from respan import Respan
from opentelemetry.instrumentation.replicate import ReplicateInstrumentor

# Defaults set here apply to all spans.
respan = Respan(
    instrumentations=[ReplicateInstrumentor()],
    customer_identifier="user_123",
    metadata={"service": "replicate-api", "version": "1.0.0"},
)

With propagate_attributes

import replicate
from respan import Respan, propagate_attributes
from opentelemetry.instrumentation.replicate import ReplicateInstrumentor

respan = Respan(instrumentations=[ReplicateInstrumentor()])

def handle_request(user_id: str, prompt: str):
    # Attributes set here apply to predictions made inside the with block.
    with propagate_attributes(
        customer_identifier=user_id,
        thread_identifier="conv_abc_123",
        metadata={"plan": "pro"},
    ):
        output = replicate.run("meta/llama-2-70b-chat", input={"prompt": prompt})
        print("".join(output))

| Attribute | Type | Description |
| --- | --- | --- |
| `customer_identifier` | `str` | Identifies the end user in Respan analytics. |
| `thread_identifier` | `str` | Groups related messages into a conversation. |
| `metadata` | `dict` | Custom key-value pairs. Merged with default metadata. |
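
To make "merged with default metadata" concrete, here is a small sketch of a dictionary merge using the values from the two examples above. The `merge_metadata` helper is hypothetical, and the rule that per-request keys win on a conflict is an assumption; this page does not state which side takes precedence.

```python
from typing import Optional

def merge_metadata(default: Optional[dict], scoped: Optional[dict]) -> dict:
    """Combine default and per-request metadata into a new dict.

    Assumption: on a key conflict, the per-request (scoped) value wins.
    """
    merged = dict(default or {})
    merged.update(scoped or {})
    return merged

default_metadata = {"service": "replicate-api", "version": "1.0.0"}
scoped_metadata = {"plan": "pro"}
print(merge_metadata(default_metadata, scoped_metadata))
# {'service': 'replicate-api', 'version': '1.0.0', 'plan': 'pro'}
```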