Limits

Set cost, request, and token limits per API key to control spending and usage.

A limit policy tracks one metric (cost, requests, or tokens) for an API key over a time period. Once usage reaches a threshold you set, the policy warns you, blocks further requests, or both.


Limit types

Type       What it caps                 Example
Cost       Spending in USD              $100/hour, $1,000/day
Requests   Number of API calls          10,000 requests/hour
Tokens     Total input/output tokens    1M tokens/day

Threshold modes

Each limit policy supports two modes that can be stacked:

Warn — when usage hits the threshold, you get a notification but requests keep flowing. Use this for soft budgets where you want visibility without disrupting production.

Block — when usage hits the threshold, the gateway rejects further requests with a 429 status. Use this for hard spending caps.

You can combine both on a single policy. For example, warn at $50/hour and block at $100/hour.
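To make the interaction between the two modes concrete, here is a minimal sketch of how a stacked policy could be evaluated. It is not the gateway's implementation: the names `Action`, `Thresholds`, `evaluate`, and `http_status` are hypothetical, and the sketch assumes a threshold triggers as soon as usage reaches it (`>=`) and that allowed requests return 200.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    ALLOW = "allow"
    WARN = "warn"    # notify, keep serving requests
    BLOCK = "block"  # reject with HTTP 429


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical field names; either threshold may be left unset.
    warn_at: Optional[float] = None
    block_at: Optional[float] = None


def evaluate(usage: float, t: Thresholds) -> Action:
    """Return the action for the usage accumulated in the current period."""
    # Block is checked first so it wins once both thresholds are reached.
    if t.block_at is not None and usage >= t.block_at:
        return Action.BLOCK
    if t.warn_at is not None and usage >= t.warn_at:
        return Action.WARN
    return Action.ALLOW


def http_status(action: Action) -> int:
    """Map an action to the status a client would see (200 is an assumption)."""
    return 429 if action is Action.BLOCK else 200


# The example from the text: warn at $50/hour, block at $100/hour.
policy = Thresholds(warn_at=50.0, block_at=100.0)
for spent in (20.0, 50.0, 99.99, 100.0):
    action = evaluate(spent, policy)
    print(f"${spent:.2f} -> {action.value} ({http_status(action)})")
```

Under these assumptions, spending between $50 and $100 in the hour only produces a warning. Requests start failing with 429 once spending reaches $100.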


Set up limits

1. Go to API Keys
   Navigate to Settings > Limits.

2. Add a policy
   Click Add Policy and choose the metric type (cost, requests, or tokens).

3. Configure thresholds
   Set the time period (hour, day, or month) and define your warn and/or block thresholds.

4. Add notifications (optional)
   Configure alerts via Slack (webhook URL) or email to get notified when thresholds are hit.

5. Save
   Save the policy. It takes effect immediately.
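The choices made in these steps can be summarized as a small data structure. The sketch below is illustrative only: `LimitPolicy` and its field names are hypothetical and do not describe the product's API or storage format. The metric and period values come from the steps above. The sanity checks (positive thresholds, warn not above block) are assumptions added for the example.

```python
from dataclasses import dataclass
from typing import Optional

METRICS = {"cost", "requests", "tokens"}
PERIODS = {"hour", "day", "month"}


@dataclass(frozen=True)
class LimitPolicy:
    metric: str                       # what is being capped
    period: str                       # window the usage is counted over
    warn_at: Optional[float] = None   # notify at this usage
    block_at: Optional[float] = None  # reject requests at this usage

    def __post_init__(self) -> None:
        if self.metric not in METRICS:
            raise ValueError(f"unknown metric: {self.metric!r}")
        if self.period not in PERIODS:
            raise ValueError(f"unknown period: {self.period!r}")
        if self.warn_at is None and self.block_at is None:
            raise ValueError("set a warn threshold, a block threshold, or both")
        for value in (self.warn_at, self.block_at):
            if value is not None and value <= 0:
                raise ValueError("thresholds must be positive")
        # Assumed sanity check: a warning above the block level never fires.
        if (self.warn_at is not None and self.block_at is not None
                and self.warn_at > self.block_at):
            raise ValueError("warn threshold should not exceed block threshold")


# Warn at $50/hour and block at $100/hour on the cost metric.
policy = LimitPolicy(metric="cost", period="hour", warn_at=50, block_at=100)
print(policy)
```

Validating at construction time catches mistakes such as an unsupported period before a policy is saved.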


Notifications

Get alerted when limits are triggered:

  • Slack — send alerts to a channel via webhook URL
  • Email — notify team members directly
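Before attaching a Slack webhook URL to a policy, it can help to confirm that the webhook posts to the intended channel. The sketch below sends a test message from your own machine. It relies only on the fact that Slack incoming webhooks accept a JSON body with a `text` field. The message wording and the `build_alert` and `send_alert` helpers are hypothetical and do not reflect the format of the alerts the gateway itself sends.

```python
import json
import urllib.request


def build_alert(key_name: str, metric: str, period: str,
                usage: float, threshold: float, mode: str) -> dict:
    """Build a Slack incoming-webhook payload; the wording is illustrative."""
    text = (f"Limit {mode}: API key '{key_name}' reached "
            f"{usage:g} of {threshold:g} ({metric}/{period}).")
    return {"text": text}


def send_alert(webhook_url: str, payload: dict) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


payload = build_alert("prod-key", "cost", "hour", 50, 50, "warn")
print(json.dumps(payload))
# To send for real, pass your own webhook URL:
# send_alert("<your Slack webhook URL>", payload)
```

The network call is left commented out so the example runs without a real webhook.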