RubyLLM

RubyLLM provides a unified Ruby interface for GPT, Claude, Gemini, and more. Since Respan is OpenAI-compatible, you can route all RubyLLM requests through the Respan gateway by pointing the OpenAI base URL to Respan and get full observability automatically.

Create an account at platform.respan.ai and grab an API key. To use the gateway, also add credits or a provider key.

Run `npx @respan/cli setup` to set up Respan with your coding agent.

RubyLLM does not have a Ruby-side tracing instrumentor, so route all calls through the Respan gateway to capture every request as a trace.

Setup

1. Install RubyLLM

```shell
gem install ruby_llm
```

Or add it to your Gemfile:

```ruby
gem "ruby_llm"
```
2. Set environment variables

```shell
export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"
```

No provider key needed — the Respan gateway handles provider authentication.

3. Configure RubyLLM with Respan

```ruby
RubyLLM.configure do |config|
  config.openai_api_key = ENV["RESPAN_API_KEY"]
  config.openai_api_base = "https://api.respan.ai/api"
end
```
4. Make your first request

```ruby
chat = RubyLLM.chat(model: "gpt-4.1-nano")
response = chat.ask("Hello, world!")
puts response.content
```
5. View your trace

Open the Traces page to see your gateway-routed calls with prompts, tokens, and cost.

Switch models

For OpenAI models this works directly. For non-OpenAI models (Claude, Gemini, etc.), add `provider: :openai` and `assume_model_exists: true` to route them through the Respan gateway.

```ruby
# OpenAI models work directly
chat = RubyLLM.chat(model: "gpt-4.1-nano")

# Non-OpenAI models need provider: :openai to route through the gateway
chat = RubyLLM.chat(model: "claude-3-5-haiku-20241022", provider: :openai, assume_model_exists: true)
chat = RubyLLM.chat(model: "gemini-2.0-flash", provider: :openai, assume_model_exists: true)

response = chat.ask("Tell me about artificial intelligence")
puts response.content
```

`provider: :openai` doesn’t mean the model is from OpenAI — it tells RubyLLM to use the OpenAI API protocol to send the request. Without it, RubyLLM would call the provider directly, bypassing Respan. `assume_model_exists: true` skips RubyLLM’s local model registry check.

See the full model list.

Streaming

```ruby
chat = RubyLLM.chat(model: "gpt-4.1-nano")
chat.ask("Explain quantum computing") do |chunk|
  print chunk.content
end
```

Multi-tenancy with contexts

Use RubyLLM contexts to isolate per-tenant configuration.

```ruby
tenant_ctx = RubyLLM.context do |config|
  config.openai_api_key = tenant.respan_api_key
  config.openai_api_base = "https://api.respan.ai/api"
end

chat = tenant_ctx.chat(model: "gpt-4.1-nano")
response = chat.ask("Hello!")
```

Rails integration

Set your Respan config in an initializer.

```ruby
# config/initializers/ruby_llm.rb
RubyLLM.configure do |config|
  config.openai_api_key = ENV["RESPAN_API_KEY"]
  config.openai_api_base = "https://api.respan.ai/api"
end
```

Use `acts_as_chat` as normal — all LLM calls will be routed through Respan.
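A minimal sketch of what that looks like, assuming RubyLLM's standard Rails setup (a `Chat` model backed by a `chats` table with a `model_id` column; names follow RubyLLM's generator defaults, so adjust to your schema):

```ruby
# app/models/chat.rb
class Chat < ApplicationRecord
  acts_as_chat # RubyLLM persists messages and tool calls on this model
end

# Anywhere in the app (controller, job, console); the request is sent
# through the Respan gateway configured in the initializer above:
chat = Chat.create!(model_id: "gpt-4.1-nano")
chat.ask("Hello from Rails!")
```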