LLM Providers Available Through the Inferoute Platform

Inferoute gives you access to the leading LLM providers through one unified endpoint. Rather than managing separate API keys, SDKs, and request formats for each provider, you send all your requests to Inferoute’s OpenAI-compatible API and let the platform handle provider authentication, request translation, and response normalization. Your application code stays consistent regardless of which provider processes a given request.
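As a rough sketch of what consistent application code can look like, the shell helpers below send the same prompt to different providers by changing only the model ID. The function names `chat_payload` and `ask` are illustrative. The ID `openai/gpt-4o` is an assumption that follows the `provider/model` pattern of `anthropic/claude-3-5-sonnet` used in the authentication example, so confirm exact identifiers before relying on them.

```shell
# Build a chat request body for a model ID ($1) and a prompt ($2).
# Simplification: assumes neither argument contains characters that
# need JSON escaping (quotes, backslashes, newlines).
chat_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}' "$1" "$2"
}

# Send the prompt to Inferoute. Requires INFEROUTE_API_KEY in the environment.
ask() {
  curl -sS https://api.inferoute.ai/v1/chat/completions \
    -H "Authorization: Bearer $INFEROUTE_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(chat_payload "$1" "$2")"
}

# Same call, different provider: only the model ID changes.
# ask "anthropic/claude-3-5-sonnet" "Explain transformer architecture."
# ask "openai/gpt-4o" "Explain transformer architecture."   # assumed model ID
```

The two calls are commented out so that sourcing the snippet does not send any requests.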

Provider pricing is passed through at cost, with a small Inferoute routing fee added on top. You pay the provider’s published token rates plus that fee; Inferoute does not mark up model prices.

Supported providers

Provider	Models	Chat	Completions	Embeddings
OpenAI	GPT-4o, GPT-4, GPT-3.5 Turbo	✓	✓	✓
Anthropic	Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku	✓	—	—
Google	Gemini 1.5 Pro, Gemini 1.5 Flash	✓	—	✓
Mistral	Mistral Large, Mistral Medium, Mistral Small	✓	✓	✓
Meta (hosted)	Llama 3.1 (8B, 70B, 405B)	✓	✓	—
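If you want to check the table programmatically before sending a request, one option is to encode it as a small lookup. The sketch below is a client-side convenience, not an Inferoute feature. The function name `supports` and the provider keys (`openai`, `anthropic`, `google`, `mistral`, `meta`) are illustrative labels.

```shell
# Return success (exit 0) if a provider supports an endpoint type,
# according to the table above. Provider keys are illustrative labels.
supports() {
  case "$1:$2" in
    openai:chat|openai:completions|openai:embeddings) return 0 ;;
    anthropic:chat) return 0 ;;
    google:chat|google:embeddings) return 0 ;;
    mistral:chat|mistral:completions|mistral:embeddings) return 0 ;;
    meta:chat|meta:completions) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: check before sending an embeddings request.
if supports anthropic embeddings; then
  echo "anthropic: embeddings available"
else
  echo "anthropic: embeddings not available"   # this branch runs
fi
```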

OpenAI

Inferoute supports OpenAI’s GPT-4o for multimodal tasks, GPT-4 for high-accuracy text tasks, and GPT-3.5 Turbo for fast, cost-efficient workloads. OpenAI models are also available for text embeddings.
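An embeddings request might look like the sketch below. Two things here are assumptions rather than documented facts: the endpoint path `/v1/embeddings`, which mirrors the OpenAI API convention, and the model ID `openai/text-embedding-3-small`, which follows the `provider/model` naming pattern. The function names are illustrative.

```shell
# Build an embeddings request body for a model ID ($1) and input text ($2).
# Simplification: assumes the arguments need no JSON escaping.
embedding_payload() {
  printf '{"model": "%s", "input": "%s"}' "$1" "$2"
}

# Assumed endpoint path, mirroring OpenAI's /v1/embeddings convention.
# Requires INFEROUTE_API_KEY in the environment.
embed() {
  curl -sS https://api.inferoute.ai/v1/embeddings \
    -H "Authorization: Bearer $INFEROUTE_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(embedding_payload "$1" "$2")"
}

# embed "openai/text-embedding-3-small" "The quick brown fox"   # assumed model ID
```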

Anthropic

Claude models are available for chat and instruction-following tasks. Claude 3.5 Sonnet offers the best balance of speed and reasoning quality. Claude 3 Opus is Anthropic’s most capable model for complex analysis. Claude 3 Haiku is optimized for low-latency, high-volume use cases.

Google

Gemini 1.5 Pro supports long-context inputs (up to 1M tokens) and multimodal inputs including images and documents. Gemini 1.5 Flash is a faster, lighter-weight variant suited for latency-sensitive applications.

Mistral

Mistral models are European-hosted and offer strong multilingual performance. Mistral Large is the flagship model; Mistral Medium and Mistral Small trade capability for lower cost and higher throughput.

Meta Llama (hosted)

Meta’s Llama 3.1 models are available through Inferoute via third-party hosting partners. You access them through the same Inferoute endpoint — no need to provision your own hosting infrastructure.

Authentication and credentials

You do not need a separate account with each provider. Inferoute manages provider credentials on your behalf. All you need is a single Inferoute API key, which you pass as a Bearer token in the Authorization header of every request.

curl https://api.inferoute.ai/v1/chat/completions \
  -H "Authorization: Bearer $INFEROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-3-5-sonnet",
    "messages": [{"role": "user", "content": "Explain transformer architecture."}]
  }'

Inferoute securely stores and rotates provider credentials. Provider keys are never exposed in API responses or logs.

Real-time availability monitoring

Inferoute continuously monitors the health of every provider. Metrics tracked include:

Uptime: whether the provider’s API is returning successful responses
Latency: current p50 and p95 response times per model
Error rate: percentage of requests returning 5xx or timeout errors

When a provider experiences degraded performance or an outage, Inferoute’s routing engine automatically deprioritizes or bypasses it. If you have the availability or latency strategy active, your requests are rerouted to healthy providers without any action on your part. You can view current provider status at status.inferoute.ai.
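To make the metric definitions above concrete, here is a small sketch of how p50 and p95 latency and an error rate could be computed from raw samples. This is not Inferoute’s implementation. The function names, the nearest-rank percentile method, and the sample values are all illustrative assumptions.

```shell
# Nearest-rank percentile ($1, an integer from 1 to 100) over latency
# samples in milliseconds, one per line on stdin.
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((p * NR + 99) / 100)   # integer ceiling of p/100 * NR
      print v[rank]
    }'
}

# Percentage of outcomes that are 5xx or "timeout", one outcome per line.
error_rate() {
  awk '
    { total++; if ($1 == "timeout" || $1 + 0 >= 500) errors++ }
    END {
      if (total == 0) exit 1
      printf "%.1f\n", 100 * errors / total
    }'
}

# Made-up samples for illustration only.
latencies="120 80 200 95 110 300 90 105 130 1000"
outcomes="200 200 503 200 timeout 200 200 200 200 200"

printf '%s\n' $latencies | percentile 50   # prints 110
printf '%s\n' $latencies | percentile 95   # prints 1000
printf '%s\n' $outcomes | error_rate       # prints 20.0
```

Note how a single slow request dominates p95 while leaving p50 almost unchanged, which is why the two are tracked separately.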
