TokenHub gives you access to the leading LLM providers through one unified endpoint. Rather than managing separate API keys, SDKs, and request formats for each provider, you send all your requests to TokenHub’s OpenAI-compatible API and let the platform handle provider authentication, request translation, and response normalization. Your application code stays consistent regardless of which provider processes a given request.Documentation Index
Fetch the complete documentation index at: https://docs.inferoute.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Provider pricing is passed through at cost with a small TokenHub routing fee added on top. You pay the provider’s published token rates — TokenHub does not mark up model prices.
Supported providers
| Provider | Models | Chat | Completions | Embeddings |
|---|---|---|---|---|
| OpenAI | GPT-4o, GPT-4, GPT-3.5 Turbo | ✓ | ✓ | ✓ |
| Anthropic | Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku | ✓ | — | — |
| Gemini 1.5 Pro, Gemini 1.5 Flash | ✓ | — | ✓ | |
| Mistral | Mistral Large, Mistral Medium, Mistral Small | ✓ | ✓ | ✓ |
| Meta (hosted) | Llama 3.1 (8B, 70B, 405B) | ✓ | ✓ | — |
OpenAI
TokenHub supports the full OpenAI model lineup including the latest GPT-4o for multimodal tasks, GPT-4 for high-accuracy text tasks, and GPT-3.5 Turbo for fast, cost-efficient workloads. OpenAI models are also available for text embeddings.Anthropic
Claude models are available for chat and instruction-following tasks. Claude 3.5 Sonnet offers the best balance of speed and reasoning quality. Claude 3 Opus is Anthropic’s most capable model for complex analysis. Claude 3 Haiku is optimized for low-latency, high-volume use cases.Mistral
Mistral models are European-hosted and offer strong multilingual performance. Mistral Large is the flagship model; Mistral Medium and Mistral Small trade capability for lower cost and higher throughput.Meta Llama (hosted)
Meta’s Llama 3.1 models are available through TokenHub via third-party hosting partners. You access them through the same TokenHub endpoint — no need to provision your own hosting infrastructure.Authentication and credentials
You do not need a separate account with each provider. TokenHub manages provider credentials on your behalf. All you need is a single TokenHub API key, which you pass as theAuthorization header in every request.
Real-time availability monitoring
TokenHub continuously monitors the health of every provider. Metrics tracked include:- Uptime: whether the provider’s API is returning successful responses
- Latency: current p50 and p95 response times per model
- Error rate: percentage of requests returning 5xx or timeout errors
status.tokenhub.ai.