TokenHub lets you set hard usage limits on any API key so you can control costs before they grow. Limits are enforced at the key level, which means you can give each team, application, or feature its own budget without affecting the rest of your account.
Types of usage limits
| Limit type | Description |
|---|---|
| Monthly token limit | Maximum number of tokens (input + output) the key can consume in a calendar month. |
| Monthly spend limit (USD) | Maximum dollar amount the key can incur in a calendar month. |
| Per-request token limit | Maximum max_tokens value allowed per individual request made with this key. |
When a key reaches one of its limits, further requests made with that key are rejected with 402 Payment Required. If the limit is a rate-style cap rather than a spend cap, you may see 429 instead, with an error message that identifies the specific limit that was reached.
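A client may want to treat the two status codes differently. The sketch below is one way to do that, not TokenHub's prescribed handling: the function name and the returned action labels are hypothetical, and only the 402 and 429 status codes come from the text above.

```python
def classify_limit_response(status_code: int) -> str:
    """Map an HTTP status code to a suggested client action.

    The action labels ("stop", "retry_later", ...) are illustrative
    assumptions, not values defined by the API.
    """
    if status_code == 402:
        # A usage limit on the key was reached. Retrying right away is
        # unlikely to help until the limit is raised or the month rolls over.
        return "stop"
    if status_code == 429:
        # Rate-style cap: a later retry may succeed.
        return "retry_later"
    if 200 <= status_code < 300:
        return "ok"
    return "other_error"


if __name__ == "__main__":
    for code in (200, 402, 429, 500):
        print(code, classify_limit_response(code))
```

Keeping the decision in a small pure function makes it easy to unit test without calling the API.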
Setting a usage limit
Enter your limits
Set one or more of the following values:
- Monthly token limit: enter the maximum number of tokens allowed per month.
- Monthly spend limit: enter the maximum USD amount allowed per month.
- Per-request token limit: enter the maximum max_tokens permitted on a single request.
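If you want to mirror a key's limits in your own application, for instance to avoid sending a request that would exceed the per-request cap, a small local model can help. The following is a minimal sketch under stated assumptions: the class, field, and function names are hypothetical, the "must be positive" rule is an assumption, and the limits themselves are still configured in the dashboard as described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KeyLimits:
    """Local mirror of the three limit types (hypothetical field names)."""
    monthly_token_limit: Optional[int] = None
    monthly_spend_limit_usd: Optional[float] = None
    per_request_token_limit: Optional[int] = None

    def __post_init__(self) -> None:
        values = (
            self.monthly_token_limit,
            self.monthly_spend_limit_usd,
            self.per_request_token_limit,
        )
        # "Set one or more" of the values: reject an empty configuration.
        if all(v is None for v in values):
            raise ValueError("set at least one limit")
        # Assumption: a limit of zero or less is treated as invalid here.
        if any(v is not None and v <= 0 for v in values):
            raise ValueError("limits must be positive")


def clamp_max_tokens(requested: int, limits: KeyLimits) -> int:
    """Return a max_tokens value that stays within the per-request limit."""
    if limits.per_request_token_limit is None:
        return requested
    return min(requested, limits.per_request_token_limit)
```

For example, with `KeyLimits(per_request_token_limit=1024)`, a requested `max_tokens` of 4096 is clamped to 1024 on the client side, while 512 passes through unchanged.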
Monitoring usage
The dashboard shows real-time usage versus limits for every key on the Settings → API Keys page and in the dedicated Usage section. You can also configure email alerts to notify you when a key approaches its limit, for example at 80% of the monthly spend cap.
Example: per-team budget isolation
If your organization has separate teams consuming the API, you can issue each team a dedicated key with its own spend limit:

| Key name | Monthly spend limit |
|---|---|
| team-search | $200 |
| team-chat | $500 |
| team-internal | $50 |
If team-search exhausts its $200 budget, its requests are blocked but every other key continues to work normally. This pattern prevents a single runaway workload from consuming your entire account balance.
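The isolation property can be illustrated with a small local simulation. This is a sketch only, not how TokenHub enforces limits internally: the class and method names are hypothetical, and the rule that a charge is rejected when it would push spend above the limit is an assumption. The key names and dollar amounts are taken from the table above.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push a key past its spend limit."""


class KeyBudgets:
    """Tracks spend per key; each key has its own independent budget."""

    def __init__(self, limits_usd: dict[str, float]) -> None:
        self._limits = dict(limits_usd)
        self._spent = {name: 0.0 for name in limits_usd}

    def charge(self, key_name: str, cost_usd: float) -> None:
        # Assumption: reject any charge that would exceed the limit.
        if self._spent[key_name] + cost_usd > self._limits[key_name]:
            raise BudgetExceeded(key_name)
        self._spent[key_name] += cost_usd

    def remaining(self, key_name: str) -> float:
        return self._limits[key_name] - self._spent[key_name]


if __name__ == "__main__":
    budgets = KeyBudgets(
        {"team-search": 200.0, "team-chat": 500.0, "team-internal": 50.0}
    )
    budgets.charge("team-search", 200.0)  # uses the whole $200 budget
    try:
        budgets.charge("team-search", 1.0)  # blocked: budget exhausted
    except BudgetExceeded as exc:
        print(f"blocked: {exc}")
    budgets.charge("team-chat", 10.0)  # other keys are unaffected
    print(budgets.remaining("team-chat"))
```

Because each key has its own counter, exhausting one budget never touches another key's remaining balance.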