Hugging Face (tracing)

Hugging Face provides open models, datasets, and the Transformers library. Respan’s Hugging Face instrumentation wraps the upstream Transformers OpenTelemetry instrumentor and normalizes TextGenerationPipeline.__call__ spans into Respan text-generation traces. Each trace carries model metadata, generation parameters, prompt and completion content, token counts, latency, and workflow context.

Create an account at platform.respan.ai and grab an API key.

Run npx @respan/cli setup to set up with your coding agent.

See Hugging Face gateway setup for the separate OpenAI-compatible gateway route.

Setup

1

Install packages

$ pip install respan-ai respan-instrumentation-huggingface transformers

2

Set environment variables

$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"
# Optional, only needed for private or gated Hugging Face models
$ export HF_TOKEN="YOUR_HF_TOKEN"

RESPAN_API_KEY exports traces to Respan. Hugging Face credentials are only needed when the Transformers model you load requires them.
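
A startup check can catch a missing key before any model is loaded. The following is a minimal sketch, not part of the Respan SDK; the helper names missing_env_vars and require_env are hypothetical.

```python
import os


def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def require_env(required, env=None):
    """Raise a RuntimeError that lists every missing variable (hypothetical helper)."""
    missing = missing_env_vars(required, env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling require_env(["RESPAN_API_KEY"]) before Respan() fails fast with a readable message. Add "HF_TOKEN" to the list only when the model you load is private or gated.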

3

Initialize and run

from respan import Respan, workflow
from respan_instrumentation_huggingface import HuggingFaceInstrumentor
from transformers import pipeline

# Register the Hugging Face instrumentor with Respan.
respan = Respan(
    app_name="huggingface-text-generation",
    instrumentations=[HuggingFaceInstrumentor()],
)

# Load a small text-generation model.
generator = pipeline(
    "text-generation",
    model="sshleifer/tiny-gpt2",
)


@workflow(name="huggingface_text_generation")
def run_generation():
    return generator(
        "Tracing Hugging Face pipelines helps",
        max_length=42,
        temperature=0.35,
        top_p=0.9,
        repetition_penalty=1.1,
    )


try:
    result = run_generation()
    print(result[0]["generated_text"])
finally:
    # Send any pending spans, then shut down, even if generation fails.
    respan.flush()
    respan.shutdown()

4

View your trace

Open the Traces page to see the workflow span and its Hugging Face text-generation child span. The child span is stored as log_type=text and includes the model name, prompt, completion, generation parameters, estimated token usage, and latency.

Supported coverage

| Feature | Coverage |
| --- | --- |
| Single prompt generation | Captures one TextGenerationPipeline.__call__ span with prompt and generated text. |
| Batch prompts | Captures indexed prompt inputs and the generated response for batched calls. |
| Workflow context | Preserves @workflow names so Hugging Face spans are grouped under the calling workflow. |
| Content privacy | Supports TRACELOOP_TRACE_CONTENT=false to suppress prompt and completion text while keeping model and request metadata. |
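
For the batch case, a Transformers text-generation pipeline called with a list of prompts returns one entry per prompt, and each entry is a list of candidate dicts with a "generated_text" key. The sketch below shows one way to consume that shape; run_batch and first_generations are hypothetical helper names, and generator is assumed to be the pipeline created in Setup.

```python
def first_generations(batch_outputs):
    """Pick the first generated text for each prompt in a batched result."""
    return [candidates[0]["generated_text"] for candidates in batch_outputs]


def run_batch(generator, prompts):
    """Call the pipeline once with a list of prompts and return one text per prompt."""
    outputs = generator(list(prompts), max_length=42)
    return first_generations(outputs)
```

Decorating the caller with @workflow, as in the Setup example, should keep the batched span grouped under that workflow.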

Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str \| None | None | Falls back to RESPAN_API_KEY env var. |
| base_url | str \| None | None | Falls back to RESPAN_BASE_URL env var. |
| instrumentations | list | [] | Plugin instrumentations to activate, such as HuggingFaceInstrumentor(). |
| customer_identifier | str \| None | None | Default customer identifier for all spans. |
| metadata | dict \| None | None | Default metadata attached to all spans. |
| environment | str \| None | None | Environment tag, such as "production". |
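
As an illustration of how these parameters combine, the sketch below collects the optional keyword arguments in one place and leaves api_key and base_url unset so they fall back to their environment variables. The helper names respan_options and init_respan are hypothetical; only the Respan() parameters come from the table above.

```python
def respan_options(environment, service, version):
    """Collect optional Respan() keyword arguments for one deployment."""
    return {
        "environment": environment,
        "metadata": {"service": service, "version": version},
    }


def init_respan(options):
    """Create the Respan client with the Hugging Face instrumentor enabled."""
    # Imported lazily so this module can be imported without the packages installed.
    from respan import Respan
    from respan_instrumentation_huggingface import HuggingFaceInstrumentor

    return Respan(instrumentations=[HuggingFaceInstrumentor()], **options)
```

A call such as init_respan(respan_options("production", "huggingface-pipeline", "1.0.0")) keeps credentials out of the code while tagging every span with the environment and service metadata.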

Attributes

In Respan()

Set defaults at initialization. These apply to all generated spans.

from respan import Respan
from respan_instrumentation_huggingface import HuggingFaceInstrumentor

respan = Respan(
    instrumentations=[HuggingFaceInstrumentor()],
    customer_identifier="user_123",
    metadata={"service": "huggingface-pipeline", "version": "1.0.0"},
)

With propagate_attributes

Override per-request using a context scope.

from respan import Respan, propagate_attributes
from respan_instrumentation_huggingface import HuggingFaceInstrumentor
from transformers import pipeline

respan = Respan(instrumentations=[HuggingFaceInstrumentor()])

# Same pipeline as in the Setup example.
generator = pipeline("text-generation", model="sshleifer/tiny-gpt2")


def handle_request(user_id: str, prompt: str):
    # Attributes set here apply to spans created inside the with block.
    with propagate_attributes(
        customer_identifier=user_id,
        thread_identifier="conv_abc_123",
        metadata={"plan": "pro"},
    ):
        output = generator(prompt, max_length=42)
        print(output[0]["generated_text"])

| Attribute | Type | Description |
| --- | --- | --- |
| customer_identifier | str | Identifies the end user in Respan analytics. |
| thread_identifier | str | Groups related messages into a conversation. |
| metadata | dict | Custom key-value pairs. Merged with default metadata. |
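
To make "merged with default metadata" concrete, the sketch below models the merge as a plain dictionary union. It only illustrates the expected result and is not Respan's implementation: effective_metadata is a hypothetical helper, and letting the per-request value win on a key collision is an assumption that the text above does not confirm.

```python
def effective_metadata(default_metadata, request_metadata):
    """Combine default and per-request metadata into one dict.

    Assumption: on a key collision the per-request value wins.
    """
    merged = dict(default_metadata or {})
    merged.update(request_metadata or {})
    return merged


defaults = {"service": "huggingface-pipeline", "version": "1.0.0"}
print(effective_metadata(defaults, {"plan": "pro"}))
```

With the defaults from the Respan() example and the per-request {"plan": "pro"}, the span would carry all three keys. If a collision matters for your analytics, verify the precedence against a real trace.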

Disable captured content

Set TRACELOOP_TRACE_CONTENT=false before initializing Respan to keep prompt and completion text out of span content while still exporting model, generation parameter, workflow, and timing metadata.

$ export TRACELOOP_TRACE_CONTENT=false
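
When the shell environment is not under your control, the same variable can be set from Python, provided this happens before Respan() is constructed. A minimal sketch follows; disable_content_capture and init_without_content are hypothetical helper names.

```python
import os


def disable_content_capture(env=None):
    """Set TRACELOOP_TRACE_CONTENT=false in the given mapping (os.environ by default)."""
    env = os.environ if env is None else env
    env["TRACELOOP_TRACE_CONTENT"] = "false"
    return env


def init_without_content():
    """Disable content capture first, then initialize Respan."""
    disable_content_capture()
    # Imported lazily so this module can be imported without the packages installed.
    from respan import Respan
    from respan_instrumentation_huggingface import HuggingFaceInstrumentor

    return Respan(instrumentations=[HuggingFaceInstrumentor()])
```

Setting the variable inside the same function that creates the client keeps the ordering requirement visible in one place.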