AI inference on custom SN50 chips, served through an OpenAI-compatible API with fast output speeds: GPT OSS 120B runs at 600+ tok/s.
Features
- OpenAI Compatible
- Streaming
- Batching
- Fine-tuning
- Embeddings
- Vision
- Audio
- Function Calling
- JSON Mode
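The Function Calling and JSON Mode features listed above presumably follow the standard OpenAI request schema, given the provider's OpenAI compatibility. A minimal sketch of such a request body; the tool definition (`get_weather`) and the model identifier are illustrative assumptions, not values from this page:

```python
import json

# Hypothetical tool definition in the OpenAI function-calling schema.
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # example tool, not a real provider API
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request_body = {
    "model": "gpt-oss-120b",  # assumed model identifier
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [tool],
    # JSON Mode: ask the server to constrain output to a valid JSON object.
    "response_format": {"type": "json_object"},
}
print(json.dumps(request_body, indent=2))
```

Whether a given model supports both tools and JSON Mode in the same request is provider-specific; check the per-model details.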
Compliance
- SOC 2
- HIPAA
- GDPR
Pricing Model: per token
Models: 7
API Base: https://api.sambanova.ai/v1 (LLM Inference)
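Because the endpoint is OpenAI-compatible, any OpenAI-style client can target it by pointing the base URL at the address above. A minimal standard-library sketch that builds (but does not send) a chat-completions request; the `SAMBANOVA_API_KEY` environment variable name and the `gpt-oss-120b` model identifier are assumptions for illustration:

```python
import json
import os
import urllib.request

API_BASE = "https://api.sambanova.ai/v1"

def build_chat_request(model: str, prompt: str, stream: bool = False) -> urllib.request.Request:
    """Build an OpenAI-style POST to /chat/completions (not yet sent)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,  # when True, the server streams SSE chunks
    }
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(body).encode(),
        headers={
            "Content-Type": "application/json",
            # Assumed: the API key is supplied via an environment variable.
            "Authorization": f"Bearer {os.environ.get('SAMBANOVA_API_KEY', '')}",
        },
        method="POST",
    )

req = build_chat_request("gpt-oss-120b", "Say hello.")
# urllib.request.urlopen(req) would perform the call; skipped here
# because it requires a valid API key.
```

The same base URL works with the official `openai` Python client by passing `base_url=API_BASE` and the key to its constructor.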
Model Catalog (7)
| Model | Developer | Size | Type | Input $/1M | Output $/1M |
|---|---|---|---|---|---|
| DeepSeek R1 0528 | DeepSeek | 671B MoE | llm | $5.00 | $7.00 |
| DeepSeek V3 | DeepSeek | 671B MoE | llm | $3.00 | $4.50 |
| DeepSeek V3.2 | DeepSeek | 671B MoE (37B active) | llm | $3.00 | $4.50 |
| GPT OSS 120B | OpenAI | 117B MoE (5.1B active) | llm | $0.22 | $0.59 |
| Gemma 3 12B | Google | 12B | llm | $0.20 | $0.35 |
| Llama 3.3 70B | Meta | 70B | llm | $0.60 | $1.20 |
| Llama 4 Maverick | Meta | 400B (17B active) | llm | $0.63 | $1.80 |

Context window, speed, and status were not listed per model on the source page.