DE

DeepInfra

Serverless Featured

Serverless inference for open-source LLMs and generative models. Pay-per-token with fast cold starts.

Features

OpenAI Compatible
Streaming
Batching
Fine-tuning
Embeddings
Vision
Audio
Function Calling
JSON Mode

Compliance

SOC 2
HIPAA
GDPR

Pricing Model

per token
Free tier: Available

Details

Models 19
API Base https://api.deepinfra.com/v1/openai
Audio / MusicImage GenerationLLM Inference

Model Catalog (19)

Model Type Input $/1M Output $/1M Context Speed Status
DeepSeek V3.2
DeepSeek · 671B MoE (37B active)
llm $0.260 $0.380
DeepSeek V4 Flash
DeepSeek
llm $0.100 $0.200
DeepSeek V4 Pro
DeepSeek
llm $1.30 $2.60
Flux 2 Max
Black Forest Labs
image gen $0.070/img
GLM 4.7
Zhipu AI
llm $0.060 $0.400
GLM 5
Zhipu AI · 744B
llm $0.600 $2.08
Kimi K2.5
Moonshot AI · 1T MoE (32B active)
llm $0.450 $2.25
Kimi K2.6
Moonshot AI · 1T MoE (32B active)
llm $0.750 $3.50 262k
Kimi K2.7 Code
Moonshot AI · 1T MoE (32B active)
code $0.740 $3.50 262k
MiniMax M2.5
MiniMax
llm $0.150 $1.15
Qwen 3 Max
Alibaba
llm $1.20 $6.00
Qwen 3 Max Thinking
Alibaba
llm $1.20 $6.00
Qwen 3 TTS
Alibaba
text to_speech $0.00002/character see notes
Qwen 3.5 122B
Alibaba · 122B MoE (10B active)
llm $0.290 $2.40
Qwen 3.5 35B
Alibaba · 35B MoE (3B active)
llm $0.140 $1.00
Qwen 3.5 397B
Alibaba · 397B MoE (17B active)
llm $0.450 $3.00
Qwen 3.5 72B
Alibaba · 72B
llm $0.260 $2.60
Qwen 3.5 9B
Alibaba · 9B
llm $0.100 $0.150
Qwen3.7 Max
Alibaba
llm $2.50 $7.50