RubyLLM

RubyLLM provides a unified Ruby interface for GPT, Claude, Gemini, and more. Since Respan is OpenAI-compatible, you can route all RubyLLM requests through the Respan gateway by pointing the OpenAI base URL to Respan and get full observability automatically.

Create an account at platform.respan.ai and grab an API key. To use the gateway, also add credits or a provider key.

Run `npx @respan/cli setup` to set up Respan with your coding agent.

RubyLLM does not have a Ruby-side tracing instrumentor, so route all calls through the Respan gateway to capture every request as a trace.

Setup

1. Install RubyLLM

```shell
gem install ruby_llm
```

Or add it to your Gemfile:

```ruby
gem "ruby_llm"
```
2. Set environment variables

```shell
export RESPAN_API_KEY="YOUR_RESPAN_API_KEY"
```

No provider key needed — the Respan gateway handles provider authentication.

3. Configure RubyLLM with Respan

```ruby
RubyLLM.configure do |config|
  config.openai_api_key = ENV["RESPAN_API_KEY"]
  config.openai_api_base = "https://api.respan.ai/api"
end
```
4. Make your first request

```ruby
chat = RubyLLM.chat(model: "gpt-4.1-nano")
response = chat.ask("Hello, world!")
puts response.content
```
5. View your trace

Open the Traces page to see your gateway-routed calls with prompts, tokens, and cost.

Switch models

For OpenAI models this works directly. For non-OpenAI models (Claude, Gemini, etc.), add `provider: :openai` and `assume_model_exists: true` to route them through the Respan gateway.

```ruby
# OpenAI models work directly
chat = RubyLLM.chat(model: "gpt-4.1-nano")

# Non-OpenAI models need provider: :openai to route through the gateway
chat = RubyLLM.chat(model: "claude-3-5-haiku-20241022", provider: :openai, assume_model_exists: true)
chat = RubyLLM.chat(model: "gemini-2.0-flash", provider: :openai, assume_model_exists: true)

response = chat.ask("Tell me about artificial intelligence")
puts response.content
```

`provider: :openai` doesn’t mean the model is from OpenAI — it tells RubyLLM to use the OpenAI API protocol to send the request. Without it, RubyLLM would call the provider directly, bypassing Respan. `assume_model_exists: true` skips RubyLLM’s local model registry check.

See the full model list.

Streaming

```ruby
chat = RubyLLM.chat(model: "gpt-4.1-nano")
chat.ask("Explain quantum computing") do |chunk|
  print chunk.content
end
```

Multi-tenancy with contexts

Use RubyLLM contexts to isolate per-tenant configuration.

```ruby
tenant_ctx = RubyLLM.context do |config|
  config.openai_api_key = tenant.respan_api_key
  config.openai_api_base = "https://api.respan.ai/api"
end

chat = tenant_ctx.chat(model: "gpt-4.1-nano")
response = chat.ask("Hello!")
```

Rails integration

Set your Respan config in an initializer.

```ruby
# config/initializers/ruby_llm.rb
RubyLLM.configure do |config|
  config.openai_api_key = ENV["RESPAN_API_KEY"]
  config.openai_api_base = "https://api.respan.ai/api"
end
```

Use `acts_as_chat` as normal — all LLM calls will be routed through Respan.
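A minimal sketch of what that looks like, assuming RubyLLM's standard Rails setup (a `Chat` model backed by a `chats` table with a `model_id` column; names follow RubyLLM's generator defaults, so adjust to your schema):

```ruby
# app/models/chat.rb
class Chat < ApplicationRecord
  acts_as_chat # RubyLLM persists messages and tool calls on this model
end

# Anywhere in the app (controller, job, console); the request is sent
# through the Respan gateway configured in the initializer above:
chat = Chat.create!(model_id: "gpt-4.1-nano")
chat.ask("Hello from Rails!")
```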