European AI inference on Token Factory. Two flavors: fast (low latency) and base (cost-efficient). Batch at 50% off.
Features
OpenAI Compatible
Streaming
Batching
Fine-tuning
Embeddings
Vision
Audio
Function Calling
JSON Mode
Compliance
SOC 2
HIPAA
GDPR
Pricing Model
per token Free tier: 1 dollar in free credits to start
Details
Models 8
API Base
https://api.studio.nebius.ai/v1 LLM Inference
Model Catalog (8)
| Model | Type | Input $/1M | Output $/1M | Context | Speed | Status |
|---|---|---|---|---|---|---|
| DeepSeek R1 0528 DeepSeek · 671B MoE | llm | $0.800 | $2.40 | — | — | |
| DeepSeek V3 DeepSeek · 671B MoE | llm | $0.500 | $1.50 | — | — | |
| GLM 4.5 Zhipu AI · 106B MoE (12B active) | llm | $0.600 | $2.20 | — | — | |
| GPT OSS 120B OpenAI · 117B MoE (5.1B active) | llm | $0.150 | $0.600 | — | — | |
| Kimi K2 Moonshot AI · 1T MoE (32B active) | llm | $0.500 | $2.40 | — | — | |
| Llama 3.3 70B Meta · 70B | llm | $0.130 | $0.400 | — | — | |
| Qwen 3 235B Alibaba · 235B MoE | llm | $0.200 | $0.600 | — | — | |
| Qwen 3 32B Alibaba · 32B | llm | $0.100 | $0.300 | — | — |