Together AI

The Together AI SDK is the official Python client for Together's inference platform, providing access to open-source models such as Llama and Mistral. Respan gives you full observability over every Together AI call, streamed response, and tool invocation, as well as gateway routing through the OpenAI-compatible Respan endpoint.

Create an account at platform.respan.ai and grab an API key. To use the gateway, also add credits or a provider key.
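To route requests through the gateway rather than calling Together directly, point any OpenAI-compatible client at the Respan endpoint. A minimal sketch; the base URL shown is an assumption, so confirm the exact value in your Respan dashboard:

import os
from openai import OpenAI

# Hypothetical gateway endpoint; confirm the exact base URL in your Respan dashboard.
client = OpenAI(
    base_url="https://api.respan.ai/v1",
    api_key=os.environ["RESPAN_API_KEY"],
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Hello through the gateway."}],
)
print(response.choices[0].message.content)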

Run npx @respan/cli setup to set up the integration with your coding agent.

Setup

1. Install packages

$ pip install respan-ai opentelemetry-instrumentation-together together
2. Set environment variables

$ export TOGETHER_API_KEY="YOUR_TOGETHER_API_KEY"
$ export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"

TOGETHER_API_KEY is used for Together requests. RESPAN_API_KEY is used to export traces to Respan.
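If exporting shell variables isn't convenient (for example, in a notebook), a sketch of setting them in-process instead, before the clients are constructed:

import os

# Set both keys before constructing Together() or Respan(); each reads its
# key from the environment at initialization.
os.environ["TOGETHER_API_KEY"] = "YOUR_TOGETHER_API_KEY"
os.environ["RESPAN_API_KEY"] = "YOUR_RESPAN_API_KEY"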

3. Initialize and run

from together import Together
from respan import Respan
from opentelemetry.instrumentation.together import TogetherAiInstrumentor

respan = Respan(instrumentations=[TogetherAiInstrumentor()])

client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Say hello in three languages."}],
)
print(response.choices[0].message.content)
respan.flush()
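Streamed responses are captured too; the span completes once the stream is consumed. A sketch continuing from the snippet above, assuming Together's OpenAI-style streaming interface:

stream = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Count to five."}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental delta, mirroring the OpenAI format.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
respan.flush()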
4. View your trace

Open the Traces page to see your auto-instrumented LLM spans.

Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str \| None | None | Falls back to RESPAN_API_KEY env var. |
| base_url | str \| None | None | Falls back to RESPAN_BASE_URL env var. |
| instrumentations | list | [] | Plugin instrumentations to activate (e.g. TogetherAiInstrumentor()). |
| customer_identifier | str \| None | None | Default customer identifier for all spans. |
| metadata | dict \| None | None | Default metadata attached to all spans. |
| environment | str \| None | None | Environment tag (e.g. "production"). |
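For example, to pin the key and environment explicitly rather than relying on environment variables, a sketch using the parameters above:

from respan import Respan
from opentelemetry.instrumentation.together import TogetherAiInstrumentor

respan = Respan(
    api_key="YOUR_RESPAN_API_KEY",   # otherwise read from RESPAN_API_KEY
    environment="production",        # tags every span with this environment
    instrumentations=[TogetherAiInstrumentor()],
)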

Attributes

In Respan()

Set defaults at initialization — these apply to all spans.

from respan import Respan
from opentelemetry.instrumentation.together import TogetherAiInstrumentor

respan = Respan(
    instrumentations=[TogetherAiInstrumentor()],
    customer_identifier="user_123",
    metadata={"service": "chat-api", "version": "1.0.0"},
)

With propagate_attributes

Override attributes per request using a context scope.

from together import Together
from respan import Respan, propagate_attributes
from opentelemetry.instrumentation.together import TogetherAiInstrumentor

respan = Respan(instrumentations=[TogetherAiInstrumentor()])
client = Together()

def handle_request(user_id: str, question: str):
    with propagate_attributes(
        customer_identifier=user_id,
        thread_identifier="conv_abc_123",
        metadata={"plan": "pro"},
    ):
        response = client.chat.completions.create(
            model="meta-llama/Llama-3-70b-chat-hf",
            messages=[{"role": "user", "content": question}],
        )
        print(response.choices[0].message.content)
| Attribute | Type | Description |
| --- | --- | --- |
| customer_identifier | str | Identifies the end user in Respan analytics. |
| thread_identifier | str | Groups related messages into a conversation. |
| metadata | dict | Custom key-value pairs. Merged with default metadata. |
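Per the table, request-scoped metadata merges with the defaults set in Respan(), so both sources end up on each span. A sketch of how the two combine:

from respan import Respan, propagate_attributes
from opentelemetry.instrumentation.together import TogetherAiInstrumentor

# Defaults attached to every span.
respan = Respan(
    instrumentations=[TogetherAiInstrumentor()],
    metadata={"service": "chat-api"},
)

# Per-request metadata merges with the defaults, so spans created in this
# scope should carry both {"service": "chat-api"} and {"plan": "pro"}.
with propagate_attributes(metadata={"plan": "pro"}):
    ...  # make Together AI calls here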

Decorators (optional)

Decorators are not required. All Together AI calls are auto-traced. Use @workflow and @task to add structure when you want to group related calls into a named workflow with nested tasks.

from together import Together
from respan import Respan, workflow, task
from opentelemetry.instrumentation.together import TogetherAiInstrumentor

respan = Respan(instrumentations=[TogetherAiInstrumentor()])
client = Together()

@task(name="generate_outline")
def outline(topic: str) -> str:
    response = client.chat.completions.create(
        model="meta-llama/Llama-3-70b-chat-hf",
        messages=[{"role": "user", "content": f"Create a brief outline about: {topic}"}],
    )
    return response.choices[0].message.content

@workflow(name="content_pipeline")
def pipeline(topic: str):
    print(outline(topic))

pipeline("Benefits of API gateways")
respan.flush()