Cerebras (gateway)

Route Cerebras model calls through Respan Gateway and track requests.
Set up Respan
  1. Sign up — Create an account at platform.respan.ai
  2. Create an API key — Generate one on the API keys page
  3. Add credits or a provider key — Add credits on the Credits page or connect your own provider key on the Integrations page
Use AI

Add the Docs MCP to your AI coding tool to get help building with Respan. No API key needed.

{
  "mcpServers": {
    "respan-docs": {
      "url": "https://mcp.respan.ai/mcp/docs"
    }
  }
}
This section is for Respan LLM gateway users.

Use Respan Gateway to call Cerebras models while keeping unified observability (logs, cost, latency, and reliability metrics) in Respan.

Prerequisites

  • A Respan API key
  • A Cerebras API key (BYOK)
Get Cerebras API key

Retrieve your Cerebras API key from Cerebras Inference Cloud to begin integration.

Supported SDKs / integrations

✅ Supported Frameworks
  • OpenAI SDK
  • LangChain SDK
  • Vercel/OpenAI
  • Vercel/Google
  • LlamaIndex SDK
  • Google GenAI
  • Respan native (Otel)
❌ Unsupported Frameworks
  • Anthropic SDK
  • Vercel/Anthropic

Configuration

There are two ways to add your Cerebras credentials to your requests: globally through the UI, or per request in code.

Via UI (Global)

  1. Navigate to Providers

     Go to the Providers page. This page allows you to manage credentials for supported model providers.

  2. Add your Cerebras credentials

     Select Cerebras and add the required credential fields.

       • cerebras.api_key: your Cerebras API key

  3. Configure Load Balancing (Optional)

     Add multiple Cerebras credential sets for redundancy. Use the Load balancing weight field to control traffic distribution.
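
To get a feel for how weights might translate into traffic, the sketch below computes each credential set's expected share of requests. It assumes the gateway splits traffic in proportion to weight (weight divided by the sum of all weights); that proportionality is an assumption made for illustration, not documented gateway behavior, and the credential set names are hypothetical.

```python
def expected_traffic_share(weights: dict[str, float]) -> dict[str, float]:
    """Return each credential set's expected fraction of requests.

    Assumption: traffic is split in proportion to weight. The gateway's
    actual balancing algorithm may differ.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical credential sets with load balancing weights 3 and 1.
shares = expected_traffic_share({"cerebras-primary": 3, "cerebras-backup": 1})
print(shares)
```

Under that assumption, a 3:1 weighting would send roughly three quarters of requests to the first credential set.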

Via code (Per-Request)

You can pass credentials dynamically in the request body. This is useful if you need to use your users’ own API keys (BYOK).

Add the customer_credentials parameter to your Gateway request:

{
  "customer_credentials": {
    "cerebras": {
      "api_key": "YOUR_CEREBRAS_API_KEY"
    }
  }
}
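
As a minimal sketch, the helper below builds a request body that carries a per-request Cerebras key. Only the customer_credentials structure comes from this page; the model and messages fields assume an OpenAI-style chat request, and build_gateway_payload is a hypothetical helper name, not part of any Respan SDK.

```python
import json


def build_gateway_payload(model: str, messages: list[dict], cerebras_api_key: str) -> dict:
    """Build a request body with a per-request Cerebras key.

    Assumption: `model` and `messages` follow the OpenAI chat format.
    Only `customer_credentials` is taken from the documentation.
    """
    return {
        "model": model,
        "messages": messages,
        "customer_credentials": {
            "cerebras": {"api_key": cerebras_api_key},
        },
    }


payload = build_gateway_payload(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    cerebras_api_key="YOUR_CEREBRAS_API_KEY",
)
print(json.dumps(payload, indent=2))
```

Building the body in one place makes it easier to swap in each user's key at request time instead of hard-coding it.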

Override credentials for a particular model (Optional)

Use credential_override when one request or model should use different credentials than the default provider key. Each entry is keyed by model ID, as with gpt-oss-120b in the example below.

{
  "customer_credentials": {
    "cerebras": {
      "api_key": "YOUR_CEREBRAS_API_KEY"
    }
  },
  "credential_override": {
    "gpt-oss-120b": {
      "api_key": "ANOTHER_CEREBRAS_API_KEY"
    }
  }
}
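
If you assemble request bodies in code, an override can be layered onto an existing body. The sketch below is one way to do that; with_credential_override is a hypothetical helper, and it assumes the override is keyed by model ID exactly as in the JSON above.

```python
import copy


def with_credential_override(payload: dict, model_id: str, api_key: str) -> dict:
    """Return a copy of `payload` with a per-model credential override.

    The input is left untouched so the default body can be reused.
    """
    updated = copy.deepcopy(payload)
    updated.setdefault("credential_override", {})[model_id] = {"api_key": api_key}
    return updated


base = {
    "customer_credentials": {"cerebras": {"api_key": "YOUR_CEREBRAS_API_KEY"}},
}
request_body = with_credential_override(
    base, "gpt-oss-120b", "ANOTHER_CEREBRAS_API_KEY"
)
print(request_body)
```

Copying the body rather than mutating it keeps the default credentials available for requests that do not need the override.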

Supported models

Find the complete and current list of Cerebras model IDs on the Respan Models page. Use the exact model ID shown there in your gateway requests.

Log Cerebras requests

If you are not using the Gateway to proxy requests, you can still log your Cerebras requests to Respan asynchronously. This lets you track cost, latency, and performance metrics for external calls.

Cerebras Python SDK
import requests

# Respan request-logging endpoint
url = "https://api.respan.ai/api/request-logs/create/"

# Describe the Cerebras call that was made outside the Gateway
payload = {
    "model": "gpt-oss-120b",
    "prompt_messages": [
        {
            "role": "user",
            "content": "Hello, how are you?"
        }
    ],
    "completion_message": {
        "role": "assistant",
        "content": "Hello from Cerebras through Respan."
    },
    "cost": 0.001,
    "generation_time": 1.2,
    "customer_params": {
        "customer_identifier": "user_123"
    }
}
headers = {
    "Authorization": "Bearer YOUR_RESPAN_API_KEY",
    "Content-Type": "application/json",
}

# Send the log entry; a timeout keeps a slow logging call from hanging the app
response = requests.post(url, headers=headers, json=payload, timeout=10)
response.raise_for_status()  # raise an error if the log request was rejected
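
In practice the generation_time value has to be measured around the model call. The sketch below shows one way to do that and assemble the log payload from the fields used above. It assumes generation_time is expressed in seconds (the page does not state the unit), leaves out cost, and uses a stand-in function in place of a real Cerebras call; timed_log_payload and fake_cerebras_call are hypothetical names.

```python
import time
from typing import Callable


def timed_log_payload(
    model: str,
    prompt_messages: list[dict],
    call_model: Callable[[list[dict]], str],
    customer_identifier: str,
) -> dict:
    """Run `call_model`, time it, and build a request-log payload.

    Assumption: `generation_time` is measured in seconds.
    """
    start = time.perf_counter()
    completion_text = call_model(prompt_messages)
    generation_time = time.perf_counter() - start
    return {
        "model": model,
        "prompt_messages": prompt_messages,
        "completion_message": {"role": "assistant", "content": completion_text},
        "generation_time": generation_time,
        "customer_params": {"customer_identifier": customer_identifier},
    }


def fake_cerebras_call(messages: list[dict]) -> str:
    # Stand-in for a real Cerebras request, used only for illustration.
    return "Hello from Cerebras through Respan."


log_payload = timed_log_payload(
    model="gpt-oss-120b",
    prompt_messages=[{"role": "user", "content": "Hello, how are you?"}],
    call_model=fake_cerebras_call,
    customer_identifier="user_123",
)
print(log_payload["completion_message"]["content"])
```

The resulting dictionary can be posted to the request-logs endpoint with the same headers as in the example above.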
Get Started with Logging

View the full guide on setting up comprehensive logging for your LLM stack.