TokenHub is an AI inference routing platform that puts every major LLM provider behind a single, unified API. Instead of managing separate credentials, rate limits, and SDKs for OpenAI, Anthropic, Google, Mistral, and others, you connect once to TokenHub and let it route your requests to the right model at the right time — automatically balancing cost, latency, and availability.Documentation Index
Fetch the complete documentation index at: https://docs.inferoute.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Quick Start
Make your first AI request through TokenHub in under five minutes.
Authentication
Generate an API key and learn how to authenticate every request.
Model Selection
Choose models by name, capability tier, or let TokenHub pick automatically.
API Reference
Explore the full OpenAI-compatible REST API with request and response examples.
How TokenHub works
TokenHub sits between your application and every LLM provider. You send a standard request to the TokenHub API, and it handles provider selection, authentication, retries, and fallback — transparently returning the model’s response as if you called the provider directly.Create an account
Sign up at tokenhub.ai and access your dashboard.
Generate an API key
Go to Settings → API Keys and create a new key. Copy it somewhere safe — it won’t be shown again.
Send your first request
Use the TokenHub base URL (
https://api.tokenhub.ai/v1) with any OpenAI-compatible SDK or HTTP client.Key features
Intelligent Routing
Automatically select the fastest, cheapest, or most available provider for each request.
Automatic Fallback
If one provider goes down or rate-limits you, TokenHub retries with a backup automatically.
Cost Optimization
Route to lower-cost models or providers without changing your application code.
Usage Controls
Set per-key token budgets and rate limits to prevent runaway costs.