AI Inference API Providers

Compare AI inference API providers side by side — pricing, supported models, features, and free tiers. Whether you need the cheapest LLM API, the fastest image generation endpoint, or a provider with OpenAI-compatible routing, find the right fit below.

Covering 7+ providers including OpenRouter, Together AI, Fireworks AI, fal.ai, Replicate, DeepInfra, Groq, and more. Filter by type, category, or browse the full directory.

All Serverless Proprietary GPU Cloud Aggregator Platform

LLM Inference Image Generation Video Generation Audio / Music Embeddings GPU Cloud Multi-Modal

Amazon Bedrock

Platform Featured

Fully managed AWS service providing foundation models from Anthropic, Meta, Mistral, Cohere, and more. OpenAI-compatible API with enterprise-grade security and compliance.

12 models

Up to $200 in AWS free tier credits for new accounts free

OpenAI Compat Streaming Finetuning Embeddings Functions Vision SOC2 HIPAA GDPR Free Tier

Google

Proprietary Featured

Official Gemini API via Google AI Studio and Vertex AI. Direct access to Gemini, Imagen, and Gemma models.

21 models

Streaming Embeddings Functions Vision Free Tier

Mistral AI

Proprietary Featured

Official Mistral API. Direct access to Mistral Large, Small, and Ministral models. EU data residency available.

4 models

OpenAI Compat Streaming Functions Vision Free Tier

OpenAI

Proprietary Featured

Official OpenAI API. Direct access to GPT, DALL-E, Whisper, and embedding models.

24 models

OpenAI Compat Streaming Embeddings Functions Vision Free Tier

Together AI

Serverless Featured

Serverless and dedicated inference for open-source LLMs, image, video, and audio models. GPU clusters available.

11 models

OpenAI Compat Streaming Finetuning Embeddings Vision Free Tier

AIMLAPI

Aggregator

Unified API for 400+ AI models across text, image, video, and audio. OpenAI-compatible with serverless inference.

27 models

OpenAI Compat Streaming Embeddings Vision Free Tier

Cloudflare Workers AI

Serverless

Edge AI inference across 200+ cities worldwide. Serverless, pay-per-use with OpenAI-compatible API.

15 models

OpenAI Compat Streaming Embeddings Vision Free Tier