Together AI

  1. Sign up — Create an account at platform.respan.ai
  2. Create an API key — Generate one on the API keys page
  3. Add credits or a provider key — Add credits on the Credits page or connect your own provider key on the Integrations page

Add the Docs MCP to your AI coding tool to get help building with Respan. No API key needed.

```json
{
  "mcpServers": {
    "respan-docs": {
      "url": "https://docs.respan.ai/mcp"
    }
  }
}
```

What is Together AI?

The Together AI SDK is the official Python client for Together’s inference platform, providing access to open-source models like Llama, Mistral, and more. Respan can auto-instrument all Together AI calls for tracing, route them through the Respan gateway, or both.

Setup

1. Install packages

```shell
pip install respan-ai opentelemetry-instrumentation-together together python-dotenv
```
2. Set environment variables

```shell
export TOGETHER_API_KEY="YOUR_TOGETHER_API_KEY"
export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"
```
3. Initialize and run

```python
import os
from dotenv import load_dotenv

load_dotenv()

from together import Together
from respan import Respan

# Auto-discover and activate all installed instrumentors (Traceloop)
respan = Respan(is_auto_instrument=True)

# Calls go directly to Together AI, auto-traced by Respan
client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Say hello in three languages."}],
)
print(response.choices[0].message.content)
respan.flush()
```
4. View your trace

Open the Traces page to see your auto-instrumented LLM spans.

This step applies to Tracing and Both setups. The Gateway-only setup does not produce traces.

Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `api_key` | `str \| None` | `None` | Falls back to the `RESPAN_API_KEY` env var. |
| `base_url` | `str \| None` | `None` | Falls back to the `RESPAN_BASE_URL` env var. |
| `is_auto_instrument` | `bool \| None` | `False` | Auto-discover and activate all installed instrumentors via OpenTelemetry entry points. |
| `customer_identifier` | `str \| None` | `None` | Default customer identifier for all spans. |
| `metadata` | `dict \| None` | `None` | Default metadata attached to all spans. |
| `environment` | `str \| None` | `None` | Environment tag (e.g. `"production"`). |

Attributes

Attach customer identifiers, thread IDs, and metadata to spans.

In Respan()

Set defaults at initialization — these apply to all spans.

```python
from respan import Respan

respan = Respan(
    is_auto_instrument=True,
    customer_identifier="user_123",
    metadata={"service": "chat-api", "version": "1.0.0"},
)
```

With propagate_attributes

Override per-request using a context manager.

```python
from together import Together
from respan import Respan, workflow, propagate_attributes

respan = Respan(
    is_auto_instrument=True,
    metadata={"service": "chat-api", "version": "1.0.0"},
)
client = Together()

@workflow(name="handle_request")
def handle_request(user_id: str, question: str):
    with propagate_attributes(
        customer_identifier=user_id,
        thread_identifier="conv_001",
        metadata={"plan": "pro"},
    ):
        response = client.chat.completions.create(
            model="meta-llama/Llama-3-70b-chat-hf",
            messages=[{"role": "user", "content": question}],
        )
        print(response.choices[0].message.content)
```

| Attribute | Type | Description |
| --- | --- | --- |
| `customer_identifier` | `str` | Identifies the end user in Respan analytics. |
| `thread_identifier` | `str` | Groups related messages into a conversation. |
| `metadata` | `dict` | Custom key-value pairs. Merged with default metadata. |
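
Since per-request `metadata` is merged with the defaults set in `Respan()`, the effective span metadata behaves like a layered dictionary merge. A minimal sketch, assuming per-request keys override defaults on conflict (the override direction is an assumption, not stated in the table):

```python
# Defaults set once in Respan(...) at initialization
defaults = {"service": "chat-api", "version": "1.0.0"}

# Per-request metadata passed via propagate_attributes(...)
per_request = {"plan": "pro"}

# Assumed merge semantics: per-request keys layered on top of the defaults
effective = {**defaults, **per_request}
print(effective)  # {'service': 'chat-api', 'version': '1.0.0', 'plan': 'pro'}
```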

Decorators

Use @workflow and @task to create structured trace hierarchies.

```python
from together import Together
from respan import Respan, workflow, task

respan = Respan(is_auto_instrument=True)
client = Together()

@task(name="generate_outline")
def outline(topic: str) -> str:
    response = client.chat.completions.create(
        model="meta-llama/Llama-3-70b-chat-hf",
        messages=[
            {"role": "user", "content": f"Create a brief outline about: {topic}"},
        ],
    )
    return response.choices[0].message.content

@workflow(name="content_pipeline")
def pipeline(topic: str):
    plan = outline(topic)
    response = client.chat.completions.create(
        model="meta-llama/Llama-3-70b-chat-hf",
        messages=[
            {"role": "user", "content": f"Write content from this outline: {plan}"},
        ],
    )
    print(response.choices[0].message.content)

pipeline("Benefits of API gateways")
respan.flush()
```

Examples

Basic chat

```python
response = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Say hello in three languages."}],
)
print(response.choices[0].message.content)
```

Gateway features

The features below require the Gateway or Both setup from Step 3.

Switch models

Change the `model` parameter to use any of 250+ models from different providers through the same gateway.

```python
# Together AI (Llama)
response = client.chat.completions.create(model="together_ai/meta-llama/Llama-3-70b-chat-hf", messages=messages)

# OpenAI
response = client.chat.completions.create(model="gpt-4.1-nano", messages=messages)

# Anthropic
response = client.chat.completions.create(model="claude-sonnet-4-5-20250929", messages=messages)
```

See the full model list.

Respan parameters

Pass additional Respan parameters via `extra_body` to use gateway features.

```python
response = client.chat.completions.create(
    model="together_ai/meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "customer_identifier": "user_123",
        "fallback_models": ["gpt-4.1-nano"],
        "metadata": {"session_id": "abc123"},
        "thread_identifier": "conversation_456",
    },
)
```

See Respan parameters for the full list.