Catalog

Models

Explore models and compare pricing across providers.

Claude 3.5 Haiku

Anthropic
LLM

Fast and affordable Claude model for high-throughput tasks.

200k ctx
4 providers

Claude 4.5 Haiku

Anthropic
LLM

Latest Haiku tier with improved capabilities at fast speed and low cost.

200k ctx
6 providers

Claude 4.5 Opus

Anthropic
LLM

Previous Opus generation with strong reasoning and coding.

200k ctx
3 providers

Claude 4.5 Sonnet

Anthropic
LLM

High-capability Claude model balancing intelligence and speed.

200k ctx
5 providers

Claude 4.6 Opus

Anthropic
LLM

Anthropic's most capable model with 1M token context and advanced reasoning.

1M ctx
9 providers

Claude 4.6 Sonnet

Anthropic
LLM

Latest Sonnet with Opus-tier capabilities at Sonnet pricing.

200k ctx
9 providers

Command A

Cohere
LLM

Cohere's latest flagship model for enterprise RAG, tool use, and agents.

256k ctx
2 providers

Command R+

Cohere
LLM

Scalable enterprise model optimized for RAG and multilingual tasks.

128k ctx
3 providers

DeepSeek R1

DeepSeek
LLM

Reasoning-focused model with chain-of-thought capabilities rivaling o1.

671B MoE 128k ctx Open
8 providers

DeepSeek R1 0528

DeepSeek
LLM

Updated R1 with improved reasoning accuracy and reduced hallucination.

671B MoE 128k ctx Open
8 providers

DeepSeek V3

DeepSeek
LLM

Open-weight 671B MoE model with strong coding and reasoning at low cost.

671B MoE 128k ctx Open
10 providers

DeepSeek V3.1

DeepSeek
LLM

Updated DeepSeek V3 with improved coding and reasoning performance.

671B MoE 128k ctx Open
7 providers

DeepSeek V3.2

DeepSeek
LLM

Latest DeepSeek V3 generation with improved reasoning and coding. MIT licensed.

671B MoE (37B active) 164k ctx Open
10 providers

GLM 4.5

Zhipu AI
LLM

Strong reasoning and coding with 106B total, 12B active MoE architecture.

106B MoE (12B active) 128k ctx Open
5 providers

GLM 4.6

Zhipu AI
LLM

Open-source frontier model with 355B parameters. MIT licensed.

355B 128k ctx Open
5 providers

GLM 4.7

Zhipu AI
LLM

Optimized for coding, reasoning, and tool use.

128k ctx
12 providers

GLM 5

Zhipu AI
LLM

Frontier 744B model trained on Huawei Ascend chips. Open source with strong agentic capabilities.

744B 128k ctx Open
11 providers

GLM 5.1

Zhipu AI
LLM

Coding-focused frontier model reaching 94% of Claude Opus 4.6's performance. 744B MoE trained on Huawei Ascend 910B; #1 open-source model on SWE-Bench Pro.

744B MoE (40B active) 203k ctx Open
3 providers

GPT OSS 120B

OpenAI
LLM

Open-weight 117B MoE model (5.1B active) with reasoning approaching o4-mini. Apache 2.0 licensed; runs on a single 80GB GPU.

117B MoE (5.1B active) 131k ctx Open
7 providers

GPT-4o

OpenAI
LLM

OpenAI's flagship multimodal model with strong reasoning, coding, and vision capabilities.

128k ctx
4 providers

GPT-4o Mini

OpenAI
LLM

Cost-efficient smaller GPT-4o variant for lightweight tasks.

128k ctx
4 providers

GPT-5 Codex

OpenAI
LLM

GPT-5 variant optimized for code generation and software engineering.

128k ctx
2 providers

GPT-5 Mini

OpenAI
LLM

Compact GPT-5 variant for lightweight tasks and rapid prototyping.

128k ctx
4 providers

GPT-5 Nano

OpenAI
LLM

Ultra-lightweight GPT-5 for high-speed, low-cost text generation.

128k ctx
4 providers

GPT-5.1 Codex

OpenAI
LLM

GPT-5.1 code-optimized variant.

128k ctx
2 providers

GPT-5.2

OpenAI
LLM

GPT-5.2 general-purpose model.

128k ctx
4 providers

GPT-5.2 Codex

OpenAI
LLM

GPT-5.2 code-optimized variant.

128k ctx
2 providers

GPT-5.3 Codex

OpenAI
LLM

GPT-5.3 code-optimized variant.

128k ctx
2 providers

GPT-5.4

OpenAI
LLM

OpenAI's latest frontier model combining reasoning, coding, and agentic workflows.

128k ctx
7 providers

GPT-5.4 Codex

OpenAI
LLM

Latest GPT-5.4 code-optimized variant with industry-leading coding capabilities.

128k ctx
2 providers

GPT-5.4 Mini

OpenAI
LLM

Compact GPT-5.4 variant balancing capability and cost.

128k ctx
3 providers

GPT-5.4 Nano

OpenAI
LLM

Ultra-lightweight GPT-5.4 for high-speed, low-cost tasks.

128k ctx
2 providers

GPT-5.4 Pro

OpenAI
LLM

Highest-capability GPT-5.4 tier with maximum reasoning depth. Premium pricing.

128k ctx
3 providers

Gemini 2.0 Flash

Google
LLM

Fast and efficient Gemini model for high-throughput workloads.

1M ctx
2 providers

Gemini 2.5 Flash

Google
LLM

Speed-optimized Gemini with strong reasoning and multimodal capabilities.

1M ctx
5 providers

Gemini 2.5 Pro

Google
LLM

High-capability Gemini model for complex reasoning and coding tasks.

1M ctx
3 providers

Gemini 3 Flash

Google
LLM

Fast and efficient Gemini 3 model for high-throughput workloads.

1M ctx
6 providers

Gemini 3 Pro

Google
LLM

High-capability Gemini 3 model. Deprecated in favor of 3.1 Pro.

1M ctx
2 providers

Gemini 3.1 Pro

Google
LLM

Google's current flagship model with top benchmark scores and 1M context.

1M ctx
5 providers

Gemma 3 12B

Google
LLM

Mid-size open-weight Gemma model with vision support.

12B 128k ctx Open
4 providers

Gemma 3 27B

Google
LLM

Largest Gemma 3 model with strong reasoning and instruction following.

27B 128k ctx Open
5 providers

Gemma 3 4B

Google
LLM

Compact open-weight model for edge and mobile deployment.

4B 32k ctx Open
3 providers

Gemma 4 12B

Google
LLM

Latest Gemma generation optimized for reasoning and agentic workflows.

12B 128k ctx Open
No providers yet

Gemma 4 27B

Google
LLM

Most capable open Gemma model, with the best intelligence per parameter in the family.

27B 128k ctx Open
4 providers

Grok 3

xAI
LLM

xAI's flagship LLM trained on 200K+ GPUs with real-time web and X integration.

2M ctx
1 provider

Grok 3 Mini

xAI
LLM

Lightweight Grok optimized for cost-efficient reasoning.

2M ctx
3 providers

Grok 4

xAI
LLM

Latest Grok with improved instruction following and reduced hallucination.

2M ctx
3 providers

Kimi K2

Moonshot AI
LLM

State-of-the-art 1T MoE model with 32B active parameters. Strong coding and agentic capabilities.

1T MoE (32B active) 128k ctx Open
9 providers

Kimi K2.5

Moonshot AI
LLM

Open-weight multimodal model with agent swarm mode supporting up to 100 parallel sub-agents.

128k ctx Open
12 providers

Llama 3.3 70B

Meta
LLM

Widely deployed open-weight model with strong general capabilities.

70B 128k ctx Open
13 providers

Llama 4 Maverick

Meta
LLM

Largest open Llama 4 model with 128 experts. 400B total, 17B active. Outperforms GPT-4o on benchmarks.

400B MoE (17B active) 1M ctx Open
6 providers

Llama 4 Scout

Meta
LLM

Natively multimodal MoE model with 10M context. 109B total, 17B active. Fits on a single H100.

109B MoE (17B active) 10M ctx Open
6 providers

MiniMax M2.5

MiniMax
LLM

MiniMax general-purpose LLM with competitive reasoning and coding capabilities.

128k ctx
6 providers

MiniMax M2.7

MiniMax
LLM

Latest MiniMax general-purpose LLM with improved reasoning.

128k ctx
3 providers

Ministral 3 8B

Mistral AI
LLM

Edge-optimized model with vision support. Apache 2.0 licensed.

8B 128k ctx Open
3 providers

Mistral Large 3

Mistral AI
LLM

Mistral's most capable model. 675B MoE with 41B active parameters.

675B MoE (41B active) 128k ctx Open
3 providers

Mistral Small 4

Mistral AI
LLM

Unified model combining fast instruct, deep reasoning, and multimodal chat. 119B parameters.

119B 256k ctx Open
4 providers

Nova Lite

Amazon
LLM

Amazon Nova Lite for fast, cost-efficient tasks.

128k ctx
1 provider

Nova Premier

Amazon
LLM

Amazon's most capable LLM for complex reasoning and enterprise tasks.

128k ctx
1 provider

Nova Pro

Amazon
LLM

Amazon Nova Pro for balanced capability and cost.

128k ctx
1 provider

Qwen 3 235B

Alibaba
LLM

Largest Qwen 3 model with hybrid thinking modes for flexible reasoning control.

235B MoE 128k ctx Open
10 providers

Qwen 3 32B

Alibaba
LLM

Mid-size Qwen 3 with strong coding and math capabilities. Open weight.

32B 128k ctx Open
6 providers

Qwen 3 8B

Alibaba
LLM

Compact Qwen 3 for edge and single-GPU deployment. Open weight.

8B 128k ctx Open
6 providers

Qwen 3 Max

Alibaba
LLM

Alibaba's highest-capability Qwen 3 model.

128k ctx
4 providers

Qwen 3 Max Thinking

Alibaba
LLM

Qwen 3 Max with extended reasoning and chain-of-thought capabilities.

128k ctx
2 providers

Qwen 3.5 122B

Alibaba
LLM

Large Qwen 3.5 MoE model with 122B total, 10B active parameters.

122B MoE (10B active) 128k ctx Open
3 providers

Qwen 3.5 35B

Alibaba
LLM

Mid-size Qwen 3.5 MoE model with 35B total, 3B active parameters.

35B MoE (3B active) 128k ctx Open
3 providers

Qwen 3.5 397B

Alibaba
LLM

Largest Qwen 3.5 MoE model with 397B total, 17B active parameters.

397B MoE (17B active) 128k ctx Open
3 providers

Qwen 3.5 72B

Alibaba
LLM

Native multimodal Qwen with text, image, and video processing.

72B 128k ctx Open
2 providers

Qwen 3.5 9B

Alibaba
LLM

Compact Qwen 3.5 for single-GPU deployment.

9B 128k ctx Open
2 providers

Qwen 3.6 Plus

Alibaba
LLM

Alibaba's latest flagship with 1M context and advanced agentic coding.

1M ctx
2 providers

o3

OpenAI
LLM

Reasoning-focused model with step-by-step deliberation for complex math, coding, and science tasks.

200k ctx
3 providers

o3 Pro

OpenAI
LLM

Most capable reasoning model in OpenAI's lineup with extended thinking for maximum reliability.

200k ctx
2 providers

o4 Mini

OpenAI
LLM

Lightweight reasoning model balancing chain-of-thought rigor with speed and cost efficiency.

200k ctx
4 providers