Retries

Automatically retry failed requests with configurable attempts and backoff.

For the complete list of request parameters, see the API reference.


When an LLM call fails, Respan detects the error and retries the request, which avoids an unnecessary failover.

Via UI

Go to the Retries page, enable retries, then set the number of retries and the initial retry time.
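To get a feel for how these two settings interact, the sketch below computes the wait before each retry. It assumes the delay doubles after every attempt, in line with the exponential backoff described on this page; the exact schedule Respan applies may differ. `backoff_schedule`, `num_retries`, and `initial_retry_time` are hypothetical names for illustration, not Respan parameters.

```python
def backoff_schedule(num_retries: int, initial_retry_time: float) -> list[float]:
    """Return the wait (in seconds) before each retry, doubling each time.

    Assumption: delay = initial_retry_time * 2 ** attempt.
    """
    return [initial_retry_time * (2 ** attempt) for attempt in range(num_retries)]


schedule = backoff_schedule(num_retries=3, initial_retry_time=0.5)
print(schedule)       # [0.5, 1.0, 2.0]
print(sum(schedule))  # 3.5
```

Under this assumption, three retries with a 0.5 second initial retry time add at most 3.5 seconds of waiting to a request.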

Automatic retry logic

Respan automatically retries a failed request when the failure is a rate limit error from the upstream provider:

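from time import sleep

# Simplified pseudocode: respan_models_data, respan_response_with_load_balance,
# RateLimitError, and fallback_retries are not defined on this page.
# respond is an illustrative wrapper name.
def respond(model):  # model: the user-requested model
    model_params = respan_models_data[model]
    # Exponential backoff retry logic
    for i in range(fallback_retries):
        try:
            return respan_response_with_load_balance(model)
        except RateLimitError:
            # Rate limited: try the configured fallback models first
            if model_params["fallback_models"]:
                for fallback_model in model_params["fallback_models"]:
                    return respan_response_with_load_balance(fallback_model)
            # No fallback models configured: wait 1s, 2s, 4s, ... then retry
            sleep(2 ** i)
        # Any other exception propagates without a retry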
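The logic above can also be tried locally with a self-contained sketch. This is not Respan's implementation: `RateLimitError`, `call_with_retries`, and `fake_provider` are hypothetical stand-ins, and unlike the pseudocode above, a rate-limited fallback model is skipped rather than propagated, and an error is raised once all attempts are used up.

```python
import time
from typing import Callable, Sequence


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate limit error."""


def call_with_retries(
    call: Callable[[str], str],
    model: str,
    fallback_models: Sequence[str] = (),
    retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try `model`, then each fallback; back off exponentially between rounds."""
    for attempt in range(retries):
        for candidate in (model, *fallback_models):
            try:
                return call(candidate)
            except RateLimitError:
                continue  # rate limited: move on to the next candidate
        sleep(2 ** attempt)  # every candidate was rate limited: wait 1s, 2s, 4s, ...
    raise RateLimitError(f"all models rate limited after {retries} attempts")


# Usage with a fake provider that rate-limits its first two calls.
calls = []


def fake_provider(name: str) -> str:
    calls.append(name)
    if len(calls) <= 2:
        raise RateLimitError(name)
    return f"response from {name}"


waits = []  # record the waits instead of actually sleeping
result = call_with_retries(fake_provider, "primary", ["backup"], sleep=waits.append)
print(result)  # response from primary
print(waits)   # [1]
```

Passing `sleep` as a parameter keeps the example fast to run and makes the backoff waits easy to inspect; other exception types are deliberately left uncaught so they surface immediately.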