Models
Explore models and compare pricing across providers.
BGE-M3
BAAI: Most popular open-source multilingual embedding model. Supports dense, sparse, and multi-vector retrieval.
DeepSeek R1
DeepSeek: Reasoning-focused model with chain-of-thought capabilities rivaling o1.
DeepSeek R1 0528
DeepSeek: Updated R1 with improved reasoning accuracy and reduced hallucination.
DeepSeek V3
DeepSeek: Open-weight 671B MoE model with strong coding and reasoning at low cost.
DeepSeek V3.1
DeepSeek: Updated DeepSeek V3 with improved coding and reasoning performance.
DeepSeek V3.2
DeepSeek: Latest DeepSeek V3 with improved reasoning and coding. 671B MoE (37B active), MIT licensed, 164K context.
Flux 1 Dev
Black Forest Labs: Open-weight development model for high-quality image generation. 12B parameters.
Flux 1 Schnell
Black Forest Labs: Fastest Flux model, optimized for speed. 12B parameters, Apache 2.0 licensed.
Flux 2 Dev
Black Forest Labs: Open-weight Flux 2 development model.
Flux 2 Klein
Black Forest Labs: Ultra-fast Flux 2 model generating images in under 0.5 seconds. Available in 4B and 9B.
Flux Kontext Dev
Black Forest Labs: Open-weight context-aware image editing model.
GLM 4.5
Zhipu AI: Strong reasoning and coding from an MoE architecture with 106B total and 12B active parameters.
GLM 4.6
Zhipu AI: Open-source frontier model with 355B parameters. MIT licensed.
GLM 5
Zhipu AI: Frontier 744B model trained on Huawei Ascend chips. Open source with strong agentic capabilities.
GPT OSS 120B
OpenAI: Open-weight 117B MoE model (5.1B active) with reasoning close to o4-mini. Apache 2.0 licensed, runs on a single 80GB GPU.
Gemma 3 12B
Google: Mid-size open-weight Gemma model with vision support.
Gemma 3 27B
Google: Largest Gemma 3 model with strong reasoning and instruction following.
Gemma 3 4B
Google: Compact open-weight model for edge and mobile deployment.
Gemma 4 12B
Google: Latest Gemma generation optimized for reasoning and agentic workflows.
Gemma 4 27B
Google: Most capable open Gemma model with best intelligence-per-parameter.
HiDream I1
HiDream: Open-source 17B parameter image model with sparse DiT architecture. MIT licensed.
HunyuanVideo 1.5
Tencent: Open-source 8.3B parameter video model with state-of-the-art visual quality on consumer GPUs.
Jina Embeddings V3
Jina AI: Multilingual text embedding model with Matryoshka representation learning.
Jina Embeddings V4
Jina AI: Multimodal embedding model supporting text, images, and PDFs. Built on Qwen2.5-VL-3B with LoRA adapters.
Kimi K2
Moonshot AI: State-of-the-art 1T MoE model with 32B active parameters. Strong coding and agentic capabilities.
Kimi K2.5
Moonshot AI: Open-weight multimodal model with agent swarm mode supporting up to 100 parallel sub-agents.
Kolors
Kuaishou: Open-source bilingual text-to-image model trained on billions of text-image pairs. Apache 2.0 licensed.
Llama 3.3 70B
Meta: Widely deployed open-weight model with strong general capabilities.
Llama 4 Maverick
Meta: Largest open Llama 4 with 128 experts. 400B total, 17B active. Beats GPT-4o on benchmarks.
Llama 4 Scout
Meta: Natively multimodal MoE model with 10M context. 109B total, 17B active. Fits on a single H100.
Ministral 3 8B
Mistral AI: Edge-optimized model with vision support. Apache 2.0 licensed.
Mistral Large 3
Mistral AI: Mistral's most capable model. 675B MoE with 41B active parameters.
Mistral Small 4
Mistral AI: Unified model combining fast instruct, deep reasoning, and multimodal chat. 119B parameters.
Qwen 3 235B
Alibaba: Largest Qwen 3 model with hybrid thinking modes for flexible reasoning control.
Qwen 3 32B
Alibaba: Mid-size Qwen 3 with strong coding and math capabilities. Open weight.
Qwen 3 8B
Alibaba: Compact Qwen 3 for edge and single-GPU deployment. Open weight.
Qwen 3 TTS
Alibaba: Qwen 3 text-to-speech model with voice cloning support.
Qwen 3.5 122B
Alibaba: Large Qwen 3.5 MoE model with 122B total and 10B active parameters.
Qwen 3.5 35B
Alibaba: Mid-size Qwen 3.5 MoE model with 35B total and 3B active parameters.
Qwen 3.5 397B
Alibaba: Largest Qwen 3.5 MoE model with 397B total and 17B active parameters.
Qwen 3.5 72B
Alibaba: Natively multimodal Qwen model with text, image, and video processing.
Qwen 3.5 9B
Alibaba: Compact Qwen 3.5 for single-GPU deployment.
Qwen3 Embedding 0.6B
Alibaba: Compact Qwen3 embedding model for edge and low-resource deployment. Apache 2.0.
Qwen3 Embedding 4B
Alibaba: Mid-size Qwen3 embedding model balancing performance and efficiency. Apache 2.0.
Qwen3 Embedding 8B
Alibaba: Ranked #1 on the MTEB multilingual leaderboard. Best open-source embedding model. Apache 2.0.
SDXL 1.0
Stability AI: Stable Diffusion XL, a widely adopted open-weight image generation model.
Stable Diffusion 3.5 Large
Stability AI: Stability AI's largest SD3 model with the best quality. Open weight.
Stable Diffusion 3.5 Large Turbo
Stability AI: Distilled SD3 Large for faster generation with minimal quality loss. Open weight.
Stable Diffusion 3.5 Medium
Stability AI: Balanced SD3 model for quality and speed. Open weight.
Voxtral TTS
Mistral AI: Open-weight 4B TTS model. 9 languages, ~90ms TTFA, voice cloning from a 3s reference. CC BY-NC 4.0.
Wan 2.2
Alibaba: Top open-source video model with MoE architecture. Trained on 1.5B videos and 10B images.
Wan 2.5
Alibaba: Previous-generation Wan video model with 720p generation 30% faster than 2.2.
Wan 2.6
Alibaba: Updated Wan video model with improved quality and speed.
Wan 2.7 Video
Alibaba: Latest Alibaba Wan video model with editing, extending, and reference-to-video capabilities.
Whisper Large V3
OpenAI: Open-weight speech recognition supporting 50+ languages. Handles accents, noise, and technical language.