Serverless and dedicated inference for open-source LLMs, image, video, and audio models. GPU clusters available.
Features
OpenAI Compatible
Streaming
Batching
Fine-tuning
Embeddings
Vision
Audio
Function Calling
JSON Mode
Compliance
SOC 2
HIPAA
GDPR
Pricing Model
per token Free tier: Available
Details
Models 9
API Base
https://api.together.xyz EmbeddingsGPU CloudImage GenerationLLM Inference
Model Catalog (9)
| Model | Type | Input $/1M | Output $/1M | Context | Speed | Status |
|---|---|---|---|---|---|---|
| DeepSeek R1 0528 DeepSeek · 671B MoE | llm | $3.00 | $7.00 | — | — | |
| DeepSeek V3.1 DeepSeek · 671B MoE | llm | $0.600 | $1.70 | — | — | |
| GLM 5 Zhipu AI · 744B | llm | $1.00 | $3.20 | — | — | |
| Gemma 3 4B Google · 4B | llm | $0.020 | $0.040 | — | — | |
| Kimi K2 Moonshot AI · 1T MoE (32B active) | llm | $1.00 | $3.00 | — | — | |
| Kimi K2.5 Moonshot AI | llm | $0.500 | $2.80 | — | — | |
| Llama 3.3 70B Meta · 70B | llm | $0.880 | $0.880 | — | — | |
| MiniMax M2.5 MiniMax | llm | $0.300 | $1.20 | — | — | |
| Qwen 3 8B Alibaba · 8B | llm | $0.100 | $0.150 | — | — |