Models

Explore models and compare pricing across providers.

DeepSeek R1

DeepSeek
LLM

Reasoning-focused model with chain-of-thought capabilities rivaling OpenAI's o1.

671B MoE (37B active) 128k ctx Open
8 providers

DeepSeek R1 0528

DeepSeek
LLM

Updated R1 with improved reasoning accuracy and reduced hallucination.

671B MoE (37B active) 128k ctx Open
8 providers

DeepSeek V3

DeepSeek
LLM

Open-weight 671B MoE model with strong coding and reasoning at low cost.

671B MoE (37B active) 128k ctx Open
10 providers

DeepSeek V3.1

DeepSeek
LLM

Updated DeepSeek V3 with improved coding and reasoning performance.

671B MoE (37B active) 128k ctx Open
7 providers

DeepSeek V3.2

DeepSeek
LLM

Latest DeepSeek V3 generation with improved reasoning and coding. MIT licensed.

671B MoE (37B active) 164k ctx Open
10 providers

GLM 4.5

Zhipu AI
LLM

Strong reasoning and coding from a MoE architecture with 106B total and 12B active parameters.

106B MoE (12B active) 128k ctx Open
5 providers

GLM 4.6

Zhipu AI
LLM

Open-source frontier model with 355B parameters. MIT licensed.

355B 128k ctx Open
5 providers

GLM 5

Zhipu AI
LLM

Frontier 744B model trained on Huawei Ascend chips. Open source with strong agentic capabilities.

744B 128k ctx Open
11 providers

GPT OSS 120B

OpenAI
LLM

Open-weight 117B MoE model (5.1B active) achieving near o4-mini reasoning. Apache 2.0 licensed, runs on a single 80GB GPU.

117B MoE (5.1B active) 131k ctx Open
7 providers

Gemma 3 12B

Google
LLM

Mid-size open-weight Gemma model with vision support.

12B 128k ctx Open
4 providers

Gemma 3 27B

Google
LLM

Largest Gemma 3 model with strong reasoning and instruction following.

27B 128k ctx Open
5 providers

Gemma 3 4B

Google
LLM

Compact open-weight model for edge and mobile deployment.

4B 128k ctx Open
3 providers

Gemma 4 12B

Google
LLM

Latest Gemma generation optimized for reasoning and agentic workflows.

12B 128k ctx Open
No providers yet

Gemma 4 27B

Google
LLM

Most capable open Gemma model with best intelligence-per-parameter.

27B 128k ctx Open
4 providers

Kimi K2

Moonshot AI
LLM

State-of-the-art 1T MoE model with 32B active parameters. Strong coding and agentic capabilities.

1T MoE (32B active) 128k ctx Open
9 providers

Kimi K2.5

Moonshot AI
LLM

Open-weight multimodal model with agent swarm mode supporting up to 100 parallel sub-agents.

128k ctx Open
12 providers

Llama 3.3 70B

Meta
LLM

Widely deployed open-weight model with strong general capabilities.

70B 128k ctx Open
13 providers

Llama 4 Maverick

Meta
LLM

Largest open Llama 4 with 128 experts. 400B total, 17B active. Outperforms GPT-4o on a range of benchmarks.

400B MoE (17B active) 1M ctx Open
6 providers

Llama 4 Scout

Meta
LLM

Natively multimodal MoE model with 10M context. 109B total, 17B active. Fits single H100.

109B MoE (17B active) 10M ctx Open
6 providers

Ministral 3 8B

Mistral AI
LLM

Edge-optimized model with vision support. Apache 2.0 licensed.

8B 128k ctx Open
3 providers

Mistral Large 3

Mistral AI
LLM

Mistral's most capable model. 675B MoE with 41B active parameters.

675B MoE (41B active) 128k ctx Open
3 providers

Mistral Small 4

Mistral AI
LLM

Unified model combining fast instruct, deep reasoning, and multimodal chat. 119B params.

119B 256k ctx Open
4 providers

Qwen 3 235B

Alibaba
LLM

Largest Qwen 3 model with hybrid thinking modes for flexible reasoning control.

235B MoE (22B active) 128k ctx Open
10 providers

Qwen 3 32B

Alibaba
LLM

Mid-size Qwen 3 with strong coding and math capabilities. Open weight.

32B 128k ctx Open
6 providers

Qwen 3 8B

Alibaba
LLM

Compact Qwen 3 for edge and single-GPU deployment. Open weight.

8B 128k ctx Open
6 providers

Qwen 3.5 122B

Alibaba
LLM

Large Qwen 3.5 MoE model with 122B total and 10B active parameters.

122B MoE (10B active) 128k ctx Open
3 providers

Qwen 3.5 35B

Alibaba
LLM

Mid-size Qwen 3.5 MoE model with 35B total and 3B active parameters.

35B MoE (3B active) 128k ctx Open
3 providers

Qwen 3.5 397B

Alibaba
LLM

Largest Qwen 3.5 MoE model with 397B total and 17B active parameters.

397B MoE (17B active) 128k ctx Open
3 providers

Qwen 3.5 72B

Alibaba
LLM

Native multimodal Qwen with text, image, and video processing.

72B 128k ctx Open
2 providers

Qwen 3.5 9B

Alibaba
LLM

Compact Qwen 3.5 for single-GPU deployment.

9B 128k ctx Open
2 providers