SA

Sambanova

Serverless

AI inference on custom SN50 chips. OpenAI-compatible API with fast output speeds. GPT OSS 120B at 600+ tok/s.

Features

OpenAI Compatible
Streaming
Batching
Fine-tuning
Embeddings
Vision
Audio
Function Calling
JSON Mode

Compliance

SOC 2
HIPAA
GDPR

Pricing Model

per token

Details

Models 7
API Base https://api.sambanova.ai/v1
LLM Inference

Model Catalog (7)

Model Type Input $/1M Output $/1M Context Speed Status
DeepSeek R1 0528
DeepSeek · 671B MoE
llm $5.00 $7.00
DeepSeek V3
DeepSeek · 671B MoE
llm $3.00 $4.50
DeepSeek V3.2
DeepSeek · 671B MoE (37B active)
llm $3.00 $4.50
GPT OSS 120B
OpenAI · 117B MoE (5.1B active)
llm $0.220 $0.590
Gemma 3 12B
Google · 12B
llm $0.200 $0.350
Llama 3.3 70B
Meta · 70B
llm $0.600 $1.20
Llama 4 Maverick
Meta · 400B (17B active)
llm $0.630 $1.80