TokenHub records token usage for every request you make and surfaces that data both in API responses and in your dashboard. Understanding how tokens are counted helps you predict costs, set appropriate budget limits, and identify opportunities to reduce spend. This page explains how tokens work, where to find usage data, and how to set limits to control costs.
What are tokens?
Tokens are the units LLMs use to process text. A token is roughly four characters or three-quarters of a word in English, though the exact mapping depends on the model’s tokenizer. Both the text you send (prompt tokens) and the text the model generates (completion tokens) consume tokens, and you are billed for both.
Token counts for the same text can vary across models because different providers use different tokenizers. The usage figures in each response reflect the exact token count used by the model that handled that request.
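The four-characters-per-token rule of thumb can be turned into a quick pre-flight estimate. The sketch below is only a heuristic for English text. The function name `estimate_tokens` and its `chars_per_token` parameter are illustrative and not part of TokenHub, and the real count always comes from the model's tokenizer.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (English-text heuristic only)."""
    if not text:
        return 0
    # Round up so short strings never estimate to zero tokens.
    return math.ceil(len(text) / chars_per_token)


prompt = "Summarize the quarterly report in three bullet points."
print(f"Estimated prompt tokens: {estimate_tokens(prompt)}")
```

Use an estimate like this only for rough planning. For billing purposes, rely on the usage figures returned with each response.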
Usage in API responses
Every response from TokenHub includes a usage object in the same format as the OpenAI API. You can read this field directly from the response to track consumption per request.
usage:
- prompt_tokens: tokens consumed by your input (system prompt + user messages + any context)
- completion_tokens: tokens generated by the model in its response
- total_tokens: sum of prompt and completion tokens; this is what you are billed on
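As an illustration of the three fields listed above, the following sketch parses a response body and checks that total_tokens equals the sum of the other two. The JSON is a made-up example: the `id` and `model` values and all token counts are placeholders, and only the usage field names come from this page.

```python
import json

# Illustrative response body; every value here is invented for the example.
raw = """{
  "id": "example-id",
  "model": "example-model",
  "usage": {"prompt_tokens": 42, "completion_tokens": 18, "total_tokens": 60}
}"""

response = json.loads(raw)
usage = response["usage"]

# total_tokens is defined as the sum of prompt and completion tokens.
computed_total = usage["prompt_tokens"] + usage["completion_tokens"]
totals_match = computed_total == usage["total_tokens"]

print(f"prompt={usage['prompt_tokens']} completion={usage['completion_tokens']} "
      f"total={usage['total_tokens']} consistent={totals_match}")
```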
Accessing usage data
Per-request usage in code
Read the usage field directly from the response object in your application.
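One way to use per-request usage is to accumulate it in your application. The sketch below assumes each response has already been parsed into a Python dict with the usage field described on this page. The `UsageTotals` class is a hypothetical helper, not something TokenHub provides.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    """Running totals built from the usage field of each response (hypothetical helper)."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0
    requests: int = 0

    def add(self, response: dict) -> None:
        # Treat a missing usage field as zero tokens rather than failing.
        usage = response.get("usage") or {}
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)
        self.total_tokens += usage.get("total_tokens", 0)
        self.requests += 1


totals = UsageTotals()
# Two made-up parsed responses standing in for real API calls.
totals.add({"usage": {"prompt_tokens": 120, "completion_tokens": 30, "total_tokens": 150}})
totals.add({"usage": {"prompt_tokens": 80, "completion_tokens": 20, "total_tokens": 100}})
print(totals)
```

Keeping the totals in a small object like this makes it straightforward to log them periodically or compare them against the dashboard figures.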
Dashboard usage views
The TokenHub dashboard provides cumulative usage views that let you analyze consumption over time. You can filter by:
- API key: see which keys are driving the most usage
- Model: compare consumption across GPT-4o, Claude, Gemini, and others
- Provider: break down usage by upstream provider
- Date range: view daily, weekly, or monthly trends
Cost calculation
TokenHub bills based on the token rates published by each provider, plus a small routing fee per request. Because completion tokens are billed, capping max_tokens can meaningfully reduce costs on high-volume workloads.
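The billing model described above can be sketched as simple arithmetic. Every number below is hypothetical: the per-million-token rates, the routing fee, and the assumption that input and output tokens are priced separately are placeholders for illustration, so check the published rates for the model you actually use.

```python
def estimate_request_cost(
    prompt_tokens: int,
    completion_tokens: int,
    input_rate_per_million: float,
    output_rate_per_million: float,
    routing_fee: float,
) -> float:
    """Estimated cost of one request: token charges plus a flat per-request fee."""
    input_cost = prompt_tokens * input_rate_per_million / 1_000_000
    output_cost = completion_tokens * output_rate_per_million / 1_000_000
    return input_cost + output_cost + routing_fee


# Hypothetical rates (USD per million tokens) and routing fee, not real prices.
INPUT_RATE = 2.00
OUTPUT_RATE = 8.00
ROUTING_FEE = 0.0001

uncapped = estimate_request_cost(1000, 500, INPUT_RATE, OUTPUT_RATE, ROUTING_FEE)
# Same prompt, but with the completion capped at 200 tokens via max_tokens.
capped = estimate_request_cost(1000, 200, INPUT_RATE, OUTPUT_RATE, ROUTING_FEE)

print(f"uncapped: ${uncapped:.4f}  capped: ${capped:.4f}")
```

The per-request difference is small, but it scales linearly with request volume, which is why a max_tokens cap matters most on high-volume workloads.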
Setting usage limits
You can set token budget limits to prevent unexpected spend.
Monthly token budgets
Set a monthly token limit for your entire organization or for a specific API key in Dashboard → Settings → Usage Limits. When the limit is reached, TokenHub returns a 429 error with a budget_exceeded code until the budget resets at the start of the next calendar month.
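A client can detect this condition and work out when requests are likely to succeed again. The sketch below makes two assumptions that this page does not confirm: that the error body has the shape `{"error": {"code": "budget_exceeded"}}`, and that the reset date can be computed from a plain calendar date (the page does not state a time zone). Both helper names are hypothetical.

```python
from datetime import date


def is_budget_exceeded(status_code: int, body: dict) -> bool:
    """True for a 429 whose error code is budget_exceeded (assumed error shape)."""
    error = body.get("error")
    if not isinstance(error, dict):
        return False
    return status_code == 429 and error.get("code") == "budget_exceeded"


def next_budget_reset(today: date) -> date:
    """First day of the calendar month after `today`."""
    if today.month == 12:
        return date(today.year + 1, 1, 1)
    return date(today.year, today.month + 1, 1)


# Made-up error payload standing in for a real response.
status, body = 429, {"error": {"code": "budget_exceeded"}}
if is_budget_exceeded(status, body):
    reset = next_budget_reset(date(2025, 12, 17))
    print(f"Budget exhausted; retry on or after {reset.isoformat()}")
```

Unlike a rate-limit 429, retrying with backoff does not help here, so it is usually better to stop sending requests or raise the limit in the dashboard.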