FI

Fireworks AI

Serverless Featured

Fast serverless inference for open-source models with per-token pricing, fine-tuning, and on-demand deployments.

Features

OpenAI Compatible
Streaming
Batching
Fine-tuning
Embeddings
Vision
Audio
Function Calling
JSON Mode

Compliance

SOC 2
HIPAA
GDPR

Pricing Model

per token
Free tier: $1 in free credits

Details

Models 24
API Base https://api.fireworks.ai/inference/v1
Audio / MusicImage GenerationLLM Inference

Model Catalog (24)

Model Type Input $/1M Output $/1M Context Speed Status
DeepSeek R1
DeepSeek · 671B MoE
llm $0.560 $1.68
DeepSeek R1 0528
DeepSeek · 671B MoE
llm $0.560 $1.68
DeepSeek V3
DeepSeek · 671B MoE
llm $0.560 $1.68
GLM 4.5
Zhipu AI · 106B MoE (12B active)
llm $0.900 $0.900
GLM 4.6
Zhipu AI · 355B
llm $0.900 $0.900
GLM 4.7
Zhipu AI
llm $0.600 $2.20
GLM 5
Zhipu AI · 744B
llm $1.00 $3.20
GLM 5.1
Z.ai · 744B MoE (40B active)
llm $1.40 $4.40
GPT OSS 120B
OpenAI · 117B MoE (5.1B active)
llm $0.150 $0.600
Gemma 3 12B
Google · 12B
llm $0.200 $0.200
Gemma 3 27B
Google · 27B
llm $0.900 $0.900
Gemma 3 4B
Google · 4B
llm $0.100 $0.100
Gemma 4 27B
Google · 27B
llm $0.900 $0.900
Kimi K2
Moonshot AI · 1T MoE (32B active)
llm $0.600 $2.50
Kimi K2.5
Moonshot AI
llm $0.600 $3.00
Llama 3.3 70B
Meta · 70B
llm $0.900 $0.900
Llama 4 Maverick
Meta · 400B (17B active)
llm $0.900 $0.900
Llama 4 Scout
Meta · 109B (17B active)
llm $0.900 $0.900
Ministral 3 8B
Mistral AI · 8B
llm $0.200 $0.200
Mistral Large 3
Mistral AI · 675B MoE (41B active)
llm $1.20 $1.20
Qwen 3 235B
Alibaba · 235B MoE
llm $1.20 $1.20
Qwen 3 32B
Alibaba · 32B
llm $0.900 $0.900
Qwen 3 8B
Alibaba · 8B
llm $0.200 $0.200
Qwen 3.6 Plus
Alibaba
llm $0.900 $0.900