TO

Together AI

Serverless Featured

Serverless and dedicated inference for open-source LLMs, image, video, and audio models. GPU clusters available.

Features

OpenAI Compatible
Streaming
Batching
Fine-tuning
Embeddings
Vision
Audio
Function Calling
JSON Mode

Compliance

SOC 2
HIPAA
GDPR

Pricing Model

per token
Free tier: Available

Details

Models 9
API Base https://api.together.xyz
EmbeddingsGPU CloudImage GenerationLLM Inference

Model Catalog (9)

Model Type Input $/1M Output $/1M Context Speed Status
DeepSeek R1 0528
DeepSeek · 671B MoE
llm $3.00 $7.00
DeepSeek V3.1
DeepSeek · 671B MoE
llm $0.600 $1.70
GLM 5
Zhipu AI · 744B
llm $1.00 $3.20
Gemma 3 4B
Google · 4B
llm $0.020 $0.040
Kimi K2
Moonshot AI · 1T MoE (32B active)
llm $1.00 $3.00
Kimi K2.5
Moonshot AI
llm $0.500 $2.80
Llama 3.3 70B
Meta · 70B
llm $0.880 $0.880
MiniMax M2.5
MiniMax
llm $0.300 $1.20
Qwen 3 8B
Alibaba · 8B
llm $0.100 $0.150