Compare every inference provider
Models, pricing, latency, and features — all in one place. Stop tab-hopping between provider docs.
Featured Providers
Top inference API providers for your AI workloads
KIE AI
Aggregator: Affordable AI API aggregator offering 259+ models across chat, image, video, and music at discounted prices.
OpenAI
Proprietary: Official OpenAI API. Direct access to GPT, DALL-E, Whisper, and embedding models.
Muapi
Aggregator: AI API aggregator with 315+ model endpoints across text, image, video, and audio at competitive prices.
Anthropic
Proprietary: Official Claude API. Direct access to Claude Opus, Sonnet, and Haiku models.
fal.ai
Serverless: Fast inference platform for generative media. Image, video, audio, and 3D models on serverless GPU infrastructure.
Google
Proprietary: Official Gemini API via Google AI Studio and Vertex AI. Direct access to Gemini, Imagen, and Gemma models.
Together AI
Serverless: Serverless and dedicated inference for open-source LLMs plus image, video, and audio models. GPU clusters available.
Mistral AI
Proprietary: Official Mistral API. Direct access to Mistral Large, Small, and Ministral models. EU data residency available.
Latest Models
Recently added models across all providers
CogView-4
Zhipu AI: Image generation model from Zhipu AI.
Qwen 3.5 397B
Alibaba: Largest Qwen 3.5 MoE model, with 397B total and 17B active parameters.
Qwen 3.5 122B
Alibaba: Large Qwen 3.5 MoE model, with 122B total and 10B active parameters.
Qwen 3.5 35B
Alibaba: Mid-size Qwen 3.5 MoE model, with 35B total and 3B active parameters.
Qwen 3.5 9B
Alibaba: Compact Qwen 3.5 model for single-GPU deployment.
Qwen 3 Max
Alibaba: Alibaba's most capable Qwen 3 model.