Nebius

Serverless

European AI inference on Token Factory. Two flavors: fast (low latency) and base (cost-efficient). Batch at 50% off.

OpenAI Compatible

Streaming

Batching

Fine-tuning

Embeddings

Vision

Audio

Function Calling

JSON Mode

SOC 2

HIPAA

GDPR

per token

Free tier: 1 dollar in free credits to start

Models 10

API Base https://api.studio.nebius.ai/v1

LLM Inference

Model Catalog (10)

Model	Type	Input $/1M	Output $/1M	Context	Speed
DeepSeek R1 0528 DeepSeek · 671B MoE	llm	$0.800	$2.40	—	—
DeepSeek V3 DeepSeek · 671B MoE	llm	$0.500	$1.50	—	—
GLM 4.5 Zhipu AI · 106B MoE (12B active)	llm	$0.600	$2.20	—	—
GPT OSS 120B OpenAI · 117B MoE (5.1B active)	llm	$0.150	$0.600	—	—
Kimi K2 Moonshot AI · 1T MoE (32B active)	llm	$0.500	$2.40	—	—
Kimi K2.6 Moonshot AI · 1T MoE (32B active)	llm	$0.950	$4.00	262k	60 t/s
Kimi K2.7 Code Moonshot AI · 1T MoE (32B active)	code	$0.950	$4.00	262k	231.26 t/s
Llama 3.3 70B Meta · 70B	llm	$0.130	$0.400	—	—
Qwen 3 235B Alibaba · 235B MoE	llm	$0.200	$0.600	—	—
Qwen 3 32B Alibaba · 32B	llm	$0.100	$0.300	—	—