Models

Explore models and compare pricing across providers.

DeepSeek R1

DeepSeek
LLM

Reasoning-focused model with chain-of-thought capabilities rivaling OpenAI's o1.

671B MoE (37B active) 128k ctx Open
8 providers

DeepSeek R1 0528

DeepSeek
LLM

Updated R1 with improved reasoning accuracy and reduced hallucination.

671B MoE (37B active) 128k ctx Open
8 providers

DeepSeek V3

DeepSeek
LLM

Open-weight 671B MoE model with strong coding and reasoning at low cost.

671B MoE (37B active) 128k ctx Open
10 providers

DeepSeek V3.1

DeepSeek
LLM

Updated DeepSeek V3 with improved coding and reasoning performance.

671B MoE (37B active) 128k ctx Open
7 providers

DeepSeek V3.2

DeepSeek
LLM

Latest DeepSeek V3 generation with improved reasoning and coding. MIT licensed.

671B MoE (37B active) 164k ctx Open
10 providers

GLM 4.5

Zhipu AI
LLM

Strong reasoning and coding from a MoE architecture with 106B total and 12B active parameters.

106B MoE (12B active) 128k ctx Open
5 providers

GLM 4.6

Zhipu AI
LLM

Open-source frontier model with 355B parameters. MIT licensed.

355B 128k ctx Open
5 providers

GLM 5

Zhipu AI
LLM

Frontier 744B model trained on Huawei Ascend chips. Open source with strong agentic capabilities.

744B 128k ctx Open
11 providers

GPT OSS 120B

OpenAI
LLM

Open-weight 117B MoE model (5.1B active) achieving near o4-mini reasoning. Apache 2.0 licensed, runs on a single 80GB GPU.

117B MoE (5.1B active) 131k ctx Open
7 providers

Gemma 3 12B

Google
LLM

Mid-size open-weight Gemma model with vision support.

12B 128k ctx Open
4 providers

Gemma 3 27B

Google
LLM

Largest Gemma 3 model with strong reasoning and instruction following.

27B 128k ctx Open
5 providers

Gemma 3 4B

Google
LLM

Compact open-weight model for edge and mobile deployment.

4B 128k ctx Open
3 providers

Gemma 4 12B

Google
LLM

Latest Gemma generation optimized for reasoning and agentic workflows.

12B 128k ctx Open
No providers yet

Gemma 4 27B

Google
LLM

Most capable open Gemma model with best intelligence-per-parameter.

27B 128k ctx Open
4 providers

Kimi K2

Moonshot AI
LLM

State-of-the-art 1T MoE model with 32B active parameters. Strong coding and agentic capabilities.

1T MoE (32B active) 128k ctx Open
9 providers

Kimi K2.5

Moonshot AI
LLM

Open-weight multimodal model with agent swarm mode supporting up to 100 parallel sub-agents.

128k ctx Open
12 providers

Llama 3.3 70B

Meta
LLM

Widely deployed open-weight model with strong general capabilities.

70B 128k ctx Open
13 providers

Llama 4 Maverick

Meta
LLM

Largest open Llama 4 with 128 experts. 400B total, 17B active. Outperforms GPT-4o on a range of benchmarks.

400B MoE (17B active) 1M ctx Open
6 providers

Llama 4 Scout

Meta
LLM

Natively multimodal MoE model with 10M context. 109B total, 17B active. Fits single H100.

109B MoE (17B active) 10M ctx Open
6 providers

Ministral 3 8B

Mistral AI
LLM

Edge-optimized model with vision support. Apache 2.0 licensed.

8B 128k ctx Open
3 providers

Mistral Large 3

Mistral AI
LLM

Mistral's most capable model. 675B MoE with 41B active parameters.

675B MoE (41B active) 128k ctx Open
3 providers

Mistral Small 4

Mistral AI
LLM

Unified model combining fast instruct, deep reasoning, and multimodal chat. 119B params.

119B 256k ctx Open
4 providers

Qwen 3 235B

Alibaba
LLM

Largest Qwen 3 model with hybrid thinking modes for flexible reasoning control.

235B MoE (22B active) 128k ctx Open
10 providers

Qwen 3 32B

Alibaba
LLM

Mid-size Qwen 3 with strong coding and math capabilities. Open weight.

32B 128k ctx Open
6 providers

Qwen 3 8B

Alibaba
LLM

Compact Qwen 3 for edge and single-GPU deployment. Open weight.

8B 128k ctx Open
6 providers

Qwen 3.5 122B

Alibaba
LLM

Large Qwen 3.5 MoE model with 122B total and 10B active parameters.

122B MoE (10B active) 128k ctx Open
3 providers

Qwen 3.5 35B

Alibaba
LLM

Mid-size Qwen 3.5 MoE model with 35B total and 3B active parameters.

35B MoE (3B active) 128k ctx Open
3 providers

Qwen 3.5 397B

Alibaba
LLM

Largest Qwen 3.5 MoE model with 397B total and 17B active parameters.

397B MoE (17B active) 128k ctx Open
3 providers

Qwen 3.5 72B

Alibaba
LLM

Native multimodal Qwen with text, image, and video processing.

72B 128k ctx Open
2 providers

Qwen 3.5 9B

Alibaba
LLM

Compact Qwen 3.5 for single-GPU deployment.

9B 128k ctx Open
2 providers