Hugging Face (tracing)

Hugging Face provides open models, datasets, and the Transformers library. Respan’s Hugging Face instrumentation wraps the upstream Transformers OpenTelemetry instrumentor and normalizes TextGenerationPipeline.__call__ spans into Respan text-generation traces. Each trace records model metadata, generation parameters, prompt and completion content, token counts, latency, and workflow context.

Set up Respan

Create an account at platform.respan.ai and grab an API key.

Run npx @respan/cli setup to set up Respan with your coding agent.

Use Respan Gateway

See Hugging Face gateway setup for the separate OpenAI-compatible gateway route.

Example projects

Python examples

Setup

Install packages

$ pip install respan-ai respan-instrumentation-huggingface transformers

Set environment variables

$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"
# Optional: only needed for private or gated Hugging Face models
$ export HF_TOKEN="YOUR_HF_TOKEN"

RESPAN_API_KEY exports traces to Respan. Hugging Face credentials are only needed when the Transformers model you load requires them.
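A missing key is easier to diagnose before any model is loaded. The following is a minimal sketch of a startup check, not part of the Respan SDK: `missing_required` is a hypothetical helper, and treating an empty value as missing is an assumption.

```python
import os
from typing import Mapping

# RESPAN_API_KEY is required for export; HF_TOKEN is optional and not checked.
REQUIRED_VARS = ("RESPAN_API_KEY",)


def missing_required(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Demonstration with an explicit mapping instead of the real environment.
print(missing_required({"HF_TOKEN": "hf_example"}))  # ['RESPAN_API_KEY']
```

Calling `missing_required()` with no argument checks the real process environment.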

Initialize and run

from respan import Respan, workflow
from respan_instrumentation_huggingface import HuggingFaceInstrumentor
from transformers import pipeline

# Register the Hugging Face instrumentor when Respan starts.
respan = Respan(
    app_name="huggingface-text-generation",
    instrumentations=[HuggingFaceInstrumentor()],
)

# A tiny public model keeps the example fast and needs no HF_TOKEN.
generator = pipeline(
    "text-generation",
    model="sshleifer/tiny-gpt2",
)


@workflow(name="huggingface_text_generation")
def run_generation():
    return generator(
        "Tracing Hugging Face pipelines helps",
        max_length=42,
        # Transformers only applies temperature and top_p when sampling is enabled.
        do_sample=True,
        temperature=0.35,
        top_p=0.9,
        repetition_penalty=1.1,
    )


try:
    result = run_generation()
    print(result[0]["generated_text"])
finally:
    # Export pending spans before the process exits.
    respan.flush()
    respan.shutdown()
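If several scripts repeat the try/finally cleanup from the example above, it can be factored into a context manager. This is a sketch only: `flushed_on_exit` is a hypothetical helper rather than a Respan API, and `RecordingClient` is a stand-in used here so the snippet runs without the SDK. The only assumption about the real client is that it exposes `flush()` and `shutdown()`, as the example shows.

```python
from contextlib import contextmanager


@contextmanager
def flushed_on_exit(client):
    """Yield client, then flush and shut it down even if the body raises."""
    try:
        yield client
    finally:
        client.flush()
        client.shutdown()


class RecordingClient:
    """Stand-in for a tracing client; records which cleanup calls ran."""

    def __init__(self):
        self.calls = []

    def flush(self):
        self.calls.append("flush")

    def shutdown(self):
        self.calls.append("shutdown")


client = RecordingClient()
with flushed_on_exit(client):
    pass  # run the traced workflow here
print(client.calls)  # ['flush', 'shutdown']
```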

View your trace

Open the Traces page to see the workflow span and its Hugging Face text-generation child span. The child span is stored as log_type=text and includes the model name, prompt, completion, generation parameters, estimated token usage, and latency.

Supported coverage

| Feature | Coverage |
| --- | --- |
| Single prompt generation | Captures one `TextGenerationPipeline.__call__` span with prompt and generated text. |
| Batch prompts | Captures indexed prompt inputs and the generated response for batched calls. |
| Workflow context | Preserves `@workflow` names so Hugging Face spans are grouped under the calling workflow. |
| Content privacy | Supports `TRACELOOP_TRACE_CONTENT=false` to suppress prompt and completion text while keeping model and request metadata. |
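Single and batched calls return differently shaped results, which matters when reading the output next to the trace. The sketch below assumes the usual Transformers text-generation return shapes: a list of dicts for one prompt, and a list of such lists for a list of prompts. `generated_texts` is a hypothetical helper, and the sample data is hand-written, not real model output.

```python
def generated_texts(output) -> list[list[str]]:
    """Normalize pipeline output to one list of generated strings per prompt."""
    if output and isinstance(output[0], dict):
        # Single prompt: wrap it so it looks like a batch of one.
        output = [output]
    return [[item["generated_text"] for item in per_prompt] for per_prompt in output]


# Hand-written stand-ins for pipeline results.
single = [{"generated_text": "first"}]
batch = [[{"generated_text": "first"}], [{"generated_text": "second"}]]

print(generated_texts(single))  # [['first']]
print(generated_texts(batch))   # [['first'], ['second']]
```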

Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `api_key` | `str \| None` | `None` | Falls back to `RESPAN_API_KEY` env var. |
| `base_url` | `str \| None` | `None` | Falls back to `RESPAN_BASE_URL` env var. |
| `instrumentations` | `list` | `[]` | Plugin instrumentations to activate, such as `HuggingFaceInstrumentor()`. |
| `customer_identifier` | `str \| None` | `None` | Default customer identifier for all spans. |
| `metadata` | `dict \| None` | `None` | Default metadata attached to all spans. |
| `environment` | `str \| None` | `None` | Environment tag, such as `"production"`. |
Attributes

In Respan()

Set defaults at initialization. These apply to all generated spans.

from respan import Respan
from respan_instrumentation_huggingface import HuggingFaceInstrumentor

respan = Respan(
    instrumentations=[HuggingFaceInstrumentor()],
    customer_identifier="user_123",
    metadata={"service": "huggingface-pipeline", "version": "1.0.0"},
)

With propagate_attributes

Override the defaults for a single request by wrapping the call in a propagate_attributes context scope.

from respan import Respan, propagate_attributes
from respan_instrumentation_huggingface import HuggingFaceInstrumentor
from transformers import pipeline

respan = Respan(instrumentations=[HuggingFaceInstrumentor()])

# Same pipeline as in the setup example above.
generator = pipeline("text-generation", model="sshleifer/tiny-gpt2")


def handle_request(user_id: str, prompt: str):
    # Attributes set here apply to spans created inside the scope.
    with propagate_attributes(
        customer_identifier=user_id,
        thread_identifier="conv_abc_123",
        metadata={"plan": "pro"},
    ):
        output = generator(prompt, max_length=42)
        print(output[0]["generated_text"])

| Attribute | Type | Description |
| --- | --- | --- |
| `customer_identifier` | `str` | Identifies the end user in Respan analytics. |
| `thread_identifier` | `str` | Groups related messages into a conversation. |
| `metadata` | `dict` | Custom key-value pairs. Merged with default metadata. |
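To make "merged with default metadata" concrete, here is a sketch of one plausible merge rule. It assumes that scoped keys take precedence over defaults when both define the same key. That precedence is an assumption, not confirmed by the table above, and `merge_metadata` is a hypothetical function rather than a Respan API.

```python
from typing import Optional


def merge_metadata(defaults: Optional[dict], scoped: Optional[dict]) -> dict:
    """Combine default and scoped metadata; scoped keys win on conflict (assumed)."""
    merged = dict(defaults or {})
    merged.update(scoped or {})
    return merged


defaults = {"service": "huggingface-pipeline", "version": "1.0.0"}
scoped = {"plan": "pro"}
print(merge_metadata(defaults, scoped))
# {'service': 'huggingface-pipeline', 'version': '1.0.0', 'plan': 'pro'}
```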

Disable captured content

Set TRACELOOP_TRACE_CONTENT=false before initializing Respan to keep prompt and completion text out of span content while still exporting model, generation parameter, workflow, and timing metadata.

$ export TRACELOOP_TRACE_CONTENT=false
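When exporting the variable in the shell is inconvenient, for example in a notebook, the same flag can be set from Python. This sketch assumes the flag is read from the process environment at initialization, as the instruction above implies. `disable_content_capture` is a hypothetical helper name.

```python
import os


def disable_content_capture() -> None:
    """Set the flag in-process; call this before Respan is initialized."""
    os.environ["TRACELOOP_TRACE_CONTENT"] = "false"


disable_content_capture()
# Initialize Respan only after this point so the flag is already in place.
print(os.environ["TRACELOOP_TRACE_CONTENT"])  # false
```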