AI Inference API Providers
Compare AI inference API providers side by side — pricing, supported models, features, and free tiers. Whether you need the cheapest LLM API, the fastest image generation endpoint, or a provider with OpenAI-compatible routing, find the right fit below.
Covering 2+ providers including OpenRouter, Together AI, Fireworks AI, fal.ai, Replicate, DeepInfra, Groq, and more. Filter by type, category, or browse the full directory.
Types: Serverless, Proprietary, GPU Cloud, Aggregator, Platform
Categories: LLM Inference, Image Generation, Video Generation, Audio / Music, Embeddings, GPU Cloud, Multi-Modal
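For readers who want to reproduce the type and feature filtering offline, here is a minimal sketch. It assumes a hypothetical `Provider` record and `filter_providers` helper (neither is part of this directory or any provider's API) and uses only the two listings shown on this page as sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    # Hypothetical record; field names are illustrative only.
    name: str
    provider_type: str
    model_count: int
    features: frozenset


# Sample data copied from the two listings on this page.
PROVIDERS = [
    Provider("Replicate", "Serverless", 67,
             frozenset({"Streaming", "Finetuning", "Vision"})),
    Provider("Together AI", "Serverless", 9,
             frozenset({"OpenAI Compat", "Streaming", "Finetuning",
                        "Embeddings", "Vision", "Free Tier"})),
]


def filter_providers(providers, provider_type=None, features=()):
    """Keep providers of the given type that offer every requested feature."""
    required = set(features)
    return [
        p for p in providers
        if (provider_type is None or p.provider_type == provider_type)
        and required <= p.features
    ]


if __name__ == "__main__":
    # Serverless providers that also advertise a free tier.
    for p in filter_providers(PROVIDERS, "Serverless", ["Free Tier"]):
        print(p.name)  # prints "Together AI"
```

Storing the feature tags as a set keeps the "has all of these features" check to a single subset comparison.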
Replicate
Type: Serverless (Featured)
Run and deploy machine learning models with a cloud API. Pay-per-use with serverless GPU infrastructure.
67 models
Features: Streaming, Finetuning, Vision
Together AI
Type: Serverless (Featured)
Serverless and dedicated inference for open-source LLMs, image, video, and audio models. GPU clusters available.
9 models
Features: OpenAI Compat, Streaming, Finetuning, Embeddings, Vision, Free Tier