Replicate

Replicate is a platform for running machine learning models in the cloud. It hosts thousands of open-source models and provides a simple API for running predictions without managing infrastructure. Respan traces every prediction along with its inputs and outputs, and can route requests through the OpenAI-compatible Respan gateway endpoint.

Create an account at platform.respan.ai and grab an API key. To use the gateway, also add credits or a provider key.

Run npx @respan/cli setup to set up with your coding agent.

Setup

1. Install packages

$ pip install respan-ai opentelemetry-instrumentation-replicate replicate

2. Set environment variables

$ export REPLICATE_API_TOKEN="YOUR_REPLICATE_API_TOKEN"
$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"

REPLICATE_API_TOKEN is used for Replicate predictions. RESPAN_API_KEY is used to export traces to Respan.
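
A script can check for both variables up front instead of failing partway through a prediction. The following is a minimal sketch of such a check; missing_env_vars is a hypothetical helper written for illustration and is not part of the Respan or Replicate SDKs.

```python
import os

# Variable names taken from the setup step above.
REQUIRED_VARS = ("REPLICATE_API_TOKEN", "RESPAN_API_KEY")


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; not part of either SDK.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Environment looks ready.")
```

Passing a plain dict as env makes the check easy to exercise without touching the real environment.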

3. Initialize and run

import replicate
from respan import Respan
from opentelemetry.instrumentation.replicate import ReplicateInstrumentor

# Initialize Respan with the Replicate instrumentation enabled.
respan = Respan(instrumentations=[ReplicateInstrumentor()])

# Run a prediction; the instrumentation records it as a span.
output = replicate.run(
    "meta/llama-2-70b-chat",
    input={"prompt": "Say hello in three languages."},
)
print("".join(output))

# Flush pending traces before the script exits.
respan.flush()

4. View your trace

Open the Traces page to see your auto-instrumented prediction spans with model, inputs, outputs, and latency.
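
The "".join(output) call in step 3 assumes the model yields an iterable of text chunks. Output shapes vary between Replicate models, and some may return a single string instead. The sketch below is a small hypothetical helper (output_to_text is not part of the replicate package) that tolerates both cases; the sample chunks stand in for real model output.

```python
def output_to_text(output):
    """Return model output as one string.

    Hypothetical helper: passes a plain string through unchanged and
    joins any other iterable of chunks, converting each chunk with str().
    """
    if isinstance(output, str):
        return output
    return "".join(str(chunk) for chunk in output)


# Stand-in for the chunks a language model might stream back.
sample_chunks = ["Hello", ", ", "Hola", ", ", "Bonjour"]
print(output_to_text(sample_chunks))  # Hello, Hola, Bonjour
```

Models that return files or images need different handling; this helper only covers text output.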

Configuration

Parameter            Type         Default  Description
api_key              str | None   None     Falls back to RESPAN_API_KEY env var.
base_url             str | None   None     Falls back to RESPAN_BASE_URL env var.
instrumentations     list         []       Plugin instrumentations to activate (e.g. ReplicateInstrumentor()).
customer_identifier  str | None   None     Default customer identifier for all spans.
metadata             dict | None  None     Default metadata attached to all spans.
environment          str | None   None     Environment tag (e.g. "production").
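
The fallback described for api_key and base_url can be pictured as a simple precedence rule: use the argument if one was passed, otherwise read the environment variable. The sketch below illustrates that rule under the assumption that an explicit argument takes priority; resolve_setting is a hypothetical function and does not show Respan's actual internals.

```python
import os


def resolve_setting(explicit, env_name, env=None):
    """Return the explicit value if given, else the environment value, else None.

    Hypothetical illustration of the fallback rule; the precedence shown
    here is an assumption, not Respan's confirmed implementation.
    """
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit
    return env.get(env_name)


# Example with a fake environment so nothing real is read.
fake_env = {"RESPAN_API_KEY": "key-from-env"}
print(resolve_setting(None, "RESPAN_API_KEY", fake_env))        # key-from-env
print(resolve_setting("key-in-code", "RESPAN_API_KEY", fake_env))  # key-in-code
```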

Attributes

In Respan()

from respan import Respan
from opentelemetry.instrumentation.replicate import ReplicateInstrumentor

# Defaults set here are attached to every span.
respan = Respan(
    instrumentations=[ReplicateInstrumentor()],
    customer_identifier="user_123",
    metadata={"service": "replicate-api", "version": "1.0.0"},
)

With propagate_attributes

import replicate
from respan import Respan, propagate_attributes
from opentelemetry.instrumentation.replicate import ReplicateInstrumentor

respan = Respan(instrumentations=[ReplicateInstrumentor()])

def handle_request(user_id: str, prompt: str):
    # Attributes set here apply to spans created inside the with block.
    with propagate_attributes(
        customer_identifier=user_id,
        thread_identifier="conv_abc_123",
        metadata={"plan": "pro"},
    ):
        output = replicate.run("meta/llama-2-70b-chat", input={"prompt": prompt})
        print("".join(output))

Attribute            Type  Description
customer_identifier  str   Identifies the end user in Respan analytics.
thread_identifier    str   Groups related messages into a conversation.
metadata             dict  Custom key-value pairs. Merged with default metadata.
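
To make "merged with default metadata" concrete, the sketch below combines the default metadata from the Respan() example with the per-request metadata from the propagate_attributes example. It assumes that per-request keys override defaults when the same key appears in both; the documentation above does not state which side wins, so treat that precedence as an assumption. merge_metadata is a hypothetical function, not a Respan API.

```python
def merge_metadata(default, scoped):
    """Combine default and per-request metadata into a new dict.

    Hypothetical illustration. Assumes per-request (scoped) keys win on
    conflict; the actual precedence in Respan is not confirmed here.
    """
    merged = dict(default or {})
    merged.update(scoped or {})
    return merged


defaults = {"service": "replicate-api", "version": "1.0.0"}
per_request = {"plan": "pro"}
print(merge_metadata(defaults, per_request))
# {'service': 'replicate-api', 'version': '1.0.0', 'plan': 'pro'}
```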