Introducing Inference Hub: Compare AI Inference Providers in One Place
We built Inference Hub to solve one problem — finding the right AI inference provider shouldn't require opening 20 tabs. Compare pricing, models, and features across every major provider.
If you’ve ever tried to figure out which AI provider to use for your project, you know the pain. Every provider has a different pricing page, different model names for the same underlying model, and different ways of presenting their costs. Want to run Flux Pro for image generation? You’ll need to compare fal.ai’s per-megapixel pricing against Replicate’s per-image pricing against Together AI’s per-step pricing. Good luck.
We built Inference Hub to fix this.
What is Inference Hub?
Inference Hub is an open directory that aggregates AI inference providers, their model catalogs, and pricing data into a single, searchable interface. Think of it as a comparison engine for the AI inference market.
Right now, we track 9 providers with over 900 model-pricing entries:
- Kie.ai — 84 models across chat, image, video, and audio
- Muapi — 193 models with the broadest image/video coverage
- fal.ai — 80 curated models from their 1,200+ endpoint catalog
- OpenRouter — 321 LLM models with unified routing
- Replicate — 65 major models across all modalities
- Together AI — 24 models with fast open-source inference
- Fireworks AI — 19 models optimized for low latency
- Groq — 6 models on custom LPU hardware built for very fast inference
- DeepInfra — 153 models at some of the lowest per-token prices
Why this matters
The AI inference market is fragmented. The same model — say, Llama 3.3 70B — might cost $0.23/Mtok on DeepInfra, $0.59/Mtok on Groq, or $0.88/Mtok on Together AI. That’s nearly a 4x price difference for the same model. Multiply that across a production workload and you’re talking real money.
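To make the spread concrete, here is a quick back-of-the-envelope calculation using the per-million-token (Mtok) rates quoted above. The 500M-token/month workload is an illustrative assumption, not real usage data:

```python
# Hypothetical monthly cost comparison for Llama 3.3 70B at the
# per-Mtok rates quoted above. The workload size is an assumption.
rates_per_mtok = {
    "DeepInfra": 0.23,
    "Groq": 0.59,
    "Together AI": 0.88,
}

monthly_tokens = 500_000_000  # assumed workload: 500M tokens/month

for provider, rate in sorted(rates_per_mtok.items(), key=lambda kv: kv[1]):
    cost = monthly_tokens / 1_000_000 * rate
    print(f"{provider}: ${cost:,.2f}/month")
# DeepInfra: $115.00/month
# Groq: $295.00/month
# Together AI: $440.00/month
```

At that volume, picking the cheapest provider over the priciest saves over $300 a month on a single model.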
For image and video models, the comparison is even harder because providers use different units: per-image, per-megapixel, per-second, per-video, per-step, or per-compute-second. We normalize all of this so you can compare apples to apples.
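A minimal sketch of what that kind of normalization can look like. The providers, prices, default step count, and the choice of a 1024×1024 reference image are illustrative assumptions on my part, not Inference Hub's actual data model:

```python
# Convert differently-denominated image prices into one comparable
# number: cost per 1024x1024 reference image. All figures are made up.

REF_MEGAPIXELS = (1024 * 1024) / 1_000_000  # ~1.05 MP reference image

def cost_per_reference_image(price: float, unit: str, steps: int = 28) -> float:
    """Normalize a quoted price to cost per 1024x1024 image."""
    if unit == "per_image":
        return price
    if unit == "per_megapixel":
        return price * REF_MEGAPIXELS
    if unit == "per_step":
        return price * steps  # assumes a default inference step count
    raise ValueError(f"unknown unit: {unit}")

quotes = [
    ("provider_a", 0.05, "per_image"),
    ("provider_b", 0.047, "per_megapixel"),
    ("provider_c", 0.0018, "per_step"),
]

for name, price, unit in quotes:
    print(f"{name}: ${cost_per_reference_image(price, unit):.4f} per image")
```

Real normalization has more wrinkles (resolution tiers, video duration, compute-second billing), but the idea is the same: pick a reference workload and express every price in terms of it.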
What you can do today
- Browse providers — See every provider’s full model catalog with pricing, features, and capabilities at a glance
- Compare models — See which providers offer the model you need, and at what price
- Discover alternatives — Find cheaper or faster providers for models you’re already using
What’s next
We’re actively expanding the directory. On the roadmap:
- More providers — Cerebras, SiliconFlow, Pixazo, WaveSpeedAI, Runware, and more
- Price comparison tables — Side-by-side pricing for the same model across providers
- Latency and throughput data — Real benchmarks, not just marketing claims
- API status monitoring — Know which providers are actually up
- Alerts — Get notified when prices change for models you care about
Built in the open
Inference Hub is a community resource. If you notice incorrect pricing, missing models, or want to suggest a provider to add, reach out. The data is only as good as we keep it.
We scrape and verify pricing data directly from provider APIs and pricing pages. When providers update their pricing, we update ours.
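A toy sketch of the change-detection step in that flow, assuming pricing snapshots are stored as simple {model: price} mappings. The sample data is made up; real provider responses are richer than this:

```python
# Diff a fresh price list against a stored snapshot to find models
# whose price changed, or that were added or removed.

def diff_prices(old: dict[str, float], new: dict[str, float]) -> dict:
    """Return {model: (old_price, new_price)} for every difference."""
    changes = {}
    for model in old.keys() | new.keys():
        before, after = old.get(model), new.get(model)
        if before != after:
            changes[model] = (before, after)
    return changes

stored = {"llama-3.3-70b": 0.23, "qwen-2.5-72b": 0.35}
fresh = {"llama-3.3-70b": 0.25, "qwen-2.5-72b": 0.35, "new-model": 0.10}

for model, (before, after) in diff_prices(stored, fresh).items():
    print(f"{model}: {before} -> {after}")
```

Anything this diff surfaces can then be verified by hand before the directory is updated.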
Start exploring at inferencehub.org/providers or jump straight to models to find what you need.