# TokenHub

## Docs

- [API Authentication Reference](https://docs.inferoute.ai/docs/api-reference/authentication.md): Authenticate with the Inferoute API using Bearer tokens. Every request requires an Authorization: Bearer header with a valid API key from your dashboard.
- [POST /v1/chat/completions](https://docs.inferoute.ai/docs/api-reference/chat-completions.md): Create chat completions. Accepts messages array and model, returns assistant message. Supports streaming, function calling, and all OpenAI-compatible params.
- [POST /v1/completions](https://docs.inferoute.ai/docs/api-reference/completions.md): Generate text completions from a prompt string. Accepts prompt and model, returns generated text. Legacy endpoint; prefer chat completions for most use cases.
- [POST /v1/embeddings](https://docs.inferoute.ai/docs/api-reference/embeddings.md): Generate vector embeddings for text input. Accepts input text and model, returns embedding vectors for semantic search, RAG pipelines, and similarity tasks.
- [API Error Codes and Responses](https://docs.inferoute.ai/docs/api-reference/errors.md): TokenHub returns standard HTTP status codes with JSON error bodies. Learn which codes to expect, what they mean, and how to handle retries in your application.
- [GET /v1/models](https://docs.inferoute.ai/docs/api-reference/models.md): List all available models on TokenHub. Returns model IDs, provider, context window, and capabilities. Use model IDs in your chat and completion requests.
- [TokenHub REST API Reference](https://docs.inferoute.ai/docs/api-reference/overview.md): Complete reference for the TokenHub REST API. Route LLM requests across providers using an OpenAI-compatible interface at https://api.tokenhub.ai/v1.
- [Authenticate with TokenHub](https://docs.inferoute.ai/docs/authentication.md): TokenHub authenticates requests with API keys passed as Bearer tokens. Learn how to generate a key, pass it in requests, and follow security best practices.
- [How to Select and Use Models Across Providers in TokenHub](https://docs.inferoute.ai/docs/concepts/models.md): Specify models by provider-prefixed name, short alias, or use 'auto' to let TokenHub select the best available model for your request.
- [LLM Providers Available Through the TokenHub Platform](https://docs.inferoute.ai/docs/concepts/providers.md): TokenHub connects you to OpenAI, Anthropic, Google, Mistral, and more through a single API — no separate accounts or credentials required.
- [How TokenHub Intelligently Routes Your AI Requests](https://docs.inferoute.ai/docs/concepts/routing.md): TokenHub's routing engine evaluates every request and selects the best provider based on your strategy — optimizing for cost, latency, or reliability.
- [Understanding Token Usage and Cost Tracking in TokenHub](https://docs.inferoute.ai/docs/concepts/tokens.md): TokenHub tracks prompt and completion tokens on every request, reports usage in the standard OpenAI format, and provides dashboard tools to monitor spend.
- [Manage TokenHub API Keys](https://docs.inferoute.ai/docs/configuration/api-keys.md): Create, rotate, and revoke TokenHub API keys from the dashboard. Set expiration dates, restrict keys to specific models, and isolate access by environment.
- [Understanding TokenHub Rate Limits](https://docs.inferoute.ai/docs/configuration/rate-limits.md): Learn how TokenHub enforces per-key rate limits, read limit headers in API responses, and handle 429 errors with exponential backoff.
- [Set Token and Spend Limits per API Key](https://docs.inferoute.ai/docs/configuration/usage-limits.md): Cap monthly token consumption or USD spend per API key to control costs and allocate budgets across teams, features, or environments.
- [TokenHub Frequently Asked Questions](https://docs.inferoute.ai/docs/faq.md): Answers to common questions about TokenHub: SDK compatibility, supported providers, billing, data handling, latency overhead, streaming, and how routing works.
- [Optimize LLM Costs with TokenHub Routing](https://docs.inferoute.ai/docs/guides/cost-optimization.md): Reduce AI spend using cost-optimized routing, right-sizing models by task complexity, usage caps, and dashboard monitoring — without changing application logic.
- [Configure Automatic Fallback Routing](https://docs.inferoute.ai/docs/guides/fallback-routing.md): Set up automatic failover so TokenHub retries failed requests with backup providers, keeping your application resilient without extra error-handling code.
- [Make Your First Request with TokenHub](https://docs.inferoute.ai/docs/guides/first-request.md): Send your first chat completion request through TokenHub's unified API and understand the response fields, provider routing, and common errors.
- [Choose the Right Model for Your Use Case](https://docs.inferoute.ai/docs/guides/model-selection.md): Select models by provider-prefixed name, alias, or let TokenHub route automatically. Compare tiers and pick the right balance of quality, speed, and cost.
- [What is TokenHub?](https://docs.inferoute.ai/docs/introduction.md): TokenHub is an OpenAI-compatible AI inference router that connects your app to every major LLM provider through a single API endpoint.
- [Get Started with TokenHub](https://docs.inferoute.ai/docs/quickstart.md): Learn how to create a TokenHub account, generate an API key, and send your first chat completion request in under five minutes using the OpenAI SDK or curl.
- [Troubleshoot Common TokenHub Issues](https://docs.inferoute.ai/docs/troubleshooting.md): Diagnose and fix the most common errors you may encounter when integrating with TokenHub, including auth failures, rate limits, and provider errors.

## OpenAPI Specs

- [openapi](https://docs.inferoute.ai/docs/api-reference/openapi.json)