AI Viewer
Live catalog

Track what changed across the frontier model market.

Follow launches, pricing, context windows, and benchmark snapshots without mixing everything into a fake universal score.

Release tracking, Pricing snapshots, Capability coverage, Benchmark context
Catalog synced Apr 6, 2026

Tracked models

200

Models actively covered in the live catalog.

Providers

10

Major labs and platforms tracked in one place.

Recent launches

8

Models first seen in the last 30 days.

Cheapest input

$0.000/M

Lowest listed input rate per million tokens.

Coverage

Filter the market by provider.

Use these filters to isolate releases and model cards by lab. The dataset is refreshed from structured sources and benchmark rows stay clearly labeled.
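As a rough illustration of the provider filter and the "first seen in the last 30 days" count, here is a minimal sketch. The rows are copied from cards on this page, but the tuple layout, the helper names `by_provider` and `recent_launches`, and the inclusive 30-day cutoff are assumptions for illustration, not the site's actual implementation.

```python
from datetime import date, timedelta

# Sample rows copied from the catalog: (provider, model, launch date).
# The tuple layout is a hypothetical shape chosen for this sketch.
MODELS = [
    ("Google", "Gemma 4 26B A4B", date(2026, 4, 3)),
    ("Qwen", "Qwen3.6 Plus (free)", date(2026, 4, 2)),
    ("xAI", "Grok 4.20", date(2026, 3, 31)),
    ("OpenAI", "GPT-5.4", date(2026, 3, 5)),
    ("Anthropic", "Claude Opus 4.6", date(2026, 2, 4)),
]


def by_provider(models, provider):
    """Keep only the rows whose provider matches exactly."""
    return [m for m in models if m[0] == provider]


def recent_launches(models, as_of, days=30):
    """Rows launched within `days` days before `as_of` (inclusive; assumed rule)."""
    cutoff = as_of - timedelta(days=days)
    return [m for m in models if cutoff <= m[2] <= as_of]


sync_date = date(2026, 4, 6)  # the catalog sync date shown on this page
xai_models = [m[1] for m in by_provider(MODELS, "xAI")]
recent = [m[1] for m in recent_launches(MODELS, sync_date)]
print(xai_models)
print(recent)
```

The sample covers only five rows, so its counts will not match the page totals.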

Last sync: Apr 6, 2026

Methodology

Fresh facts, explicit limits.

OpenRouter API

Primary source for pricing, context windows, and capability flags across providers.

AIViewer editorial layer

Strengths, watchouts, and contextual summaries written by humans, not auto-generated rankings.

AIViewer keeps pricing, capabilities, and benchmark rows separate so the data stays interpretable.
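To make the pricing source concrete, the sketch below converts an OpenRouter-style model record into the per-million rates shown on the cards. It assumes each record carries `pricing.prompt` and `pricing.completion` as USD-per-token strings and a `context_length` field; the sample record and the `normalize` helper are hypothetical, and this is not a description of how AIViewer actually ingests data.

```python
from decimal import Decimal

TOKENS_PER_MILLION = 1_000_000


def normalize(record: dict) -> dict:
    """Convert per-token USD strings to per-million-token rates.

    Decimal is used so small per-token prices do not pick up float noise.
    Field names here are assumptions about the upstream record shape.
    """
    pricing = record.get("pricing", {})
    return {
        "id": record["id"],
        "context_tokens": record.get("context_length"),
        "input_per_m": Decimal(pricing.get("prompt", "0")) * TOKENS_PER_MILLION,
        "output_per_m": Decimal(pricing.get("completion", "0")) * TOKENS_PER_MILLION,
    }


# Hypothetical record; the rates mirror a $0.130/M input, $0.400/M output card.
sample = {
    "id": "example/model",
    "context_length": 262144,
    "pricing": {"prompt": "0.00000013", "completion": "0.0000004"},
}
row = normalize(sample)
print(row)
```

Keeping the converted rates separate from any benchmark fields matches the page's stated goal of not blending pricing and benchmark rows.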

Latest releases

What changed most recently.

Active filter: All providers

Google

Google: Gemma 4 26B A4B

Apr 3, 2026

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — deli...

Context: 262,144 tokens Input: $0.130/M Output: $0.400/M
Google

Google: Gemma 4 31B

Apr 2, 2026

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking...

Context: 262,144 tokens Input: $0.140/M Output: $0.400/M
Qwen

Qwen: Qwen3.6 Plus (free)

Apr 2, 2026

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance infe...

Context: 1,000,000 tokens Input: $0.000/M Output: $0.000/M
xAI

xAI: Grok 4.20 Multi-Agent

Mar 31, 2026

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate to...

Context: 2,000,000 tokens Input: $2.00/M Output: $6.00/M
xAI

xAI: Grok 4.20

Mar 31, 2026

Grok 4.20 is xAI's newest flagship model with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prom...

Context: 2,000,000 tokens Input: $2.00/M Output: $6.00/M
Google

Google: Lyria 3 Pro Preview

Mar 30, 2026

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality...

Context: 1,048,576 tokens Input: $0.000/M Output: $0.000/M

Tracked models

Catalog view for decision-making.

Showing 200 models across all providers.

Benchmarked models: 12
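When comparing cards, it can help to turn the per-million rates into a per-request figure. The sketch below is simple arithmetic under stated assumptions: flat rates with no caching discounts, tiered pricing, or per-request tool fees, none of which the cards list. The function name `request_cost` and the token counts are illustrative.

```python
def request_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Estimated USD cost of one request from per-million-token rates.

    Assumes flat pricing; real bills may include extras not shown on the cards.
    """
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# Rates from the GPT-5.4 card on this page: $2.50/M input, $15/M output.
# The token counts (10,000 in, 2,000 out) are an arbitrary example.
cost = request_cost(10_000, 2_000, 2.50, 15.0)
print(f"${cost:.3f}")
```

With these numbers the input side contributes $0.025 and the output side $0.030, so output tokens dominate despite being a fifth of the volume.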
Google

Google: Gemma 4 26B A4B

Apr 3, 2026

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B qual...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.130/M
Output
$0.400/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemma 4 31B

Apr 2, 2026

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, nat...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.140/M
Output
$0.400/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3.6 Plus (free)

Apr 2, 2026

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to t...

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok 4.20 Multi-Agent

Mar 31, 2026

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesi...

Context
2,000,000 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$6.00/M
Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok 4.20

Mar 31, 2026

Grok 4.20 is xAI's newest flagship model with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, delive...

Context
2,000,000 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$6.00/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Lyria 3 Pro Preview

Mar 30, 2026

Full-length songs are priced at $0.08 per song. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz stereo audio...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Vision, Audio
Last refreshed Apr 6, 2026
Open model page
Google

Google: Lyria 3 Clip Preview

Mar 30, 2026

30-second clips are priced at $0.04 per clip. Lyria 3 is Google's family of music generation models, available through the Gemini API. With Lyria 3, you can generate high-quality, 48kHz stere...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Vision, Audio
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3.6 Plus Preview (free)

Mar 30, 2026

Qwen 3.6 Plus Preview is the next-generation evolution of the Qwen Plus series, featuring an advanced hybrid architecture that improves efficiency and scalability. It delivers stronger reasoning and m...

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Tool use, Reasoning
Last refreshed Apr 3, 2026
Open model page
OpenAI

OpenAI: GPT-5.4 Nano

Mar 17, 2026

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and high-volume tasks. It supports text and image inputs and is designed for low-lat...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$0.200/M
Output
$1.25/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.4 Mini

Mar 17, 2026

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput workloads. It supports text and image inputs with strong performance across reasoni...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$0.750/M
Output
$4.50/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Mistral Small 4

Mar 16, 2026

Mistral Small 4 is the next major release in the Mistral Small family, unifying the capabilities of several flagship Mistral models into a single system. It combines strong reasoning from Magistral, m...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.150/M
Output
$0.600/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok 4.20 Multi-Agent Beta

Mar 12, 2026

Grok 4.20 Multi-Agent Beta is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and syn...

Context
2,000,000 tokens
Benchmark
Arena Leaderboard: 1,474
Input
$2.00/M
Output
$6.00/M
Vision, Reasoning
Last refreshed Mar 31, 2026
Open model page
xAI

xAI: Grok 4.20 Beta

Mar 12, 2026

Grok 4.20 Beta is xAI's newest flagship model with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherence, d...

Context
2,000,000 tokens
Benchmark
Arena Leaderboard: 1,491
Input
$2.00/M
Output
$6.00/M
Tool use, Vision, Reasoning
Last refreshed Mar 31, 2026
Open model page
Qwen

Qwen: Qwen3.5-9B

Mar 10, 2026

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture. It uses a unified...

Context
256,000 tokens
Benchmark
Not listed yet
Input
$0.050/M
Output
$0.150/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.4 Pro

Mar 5, 2026

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabilities for complex, high-stakes tasks. It features a 1M+ token context window (922...

Context
1,050,000 tokens
Benchmark
Not listed yet
Input
$30/M
Output
$180/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.4

Mar 5, 2026

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ token context window (922K input, 128K output) with support for text and image input...

Context
1,050,000 tokens
Benchmark
Arena Leaderboard: 1,484
Input
$2.50/M
Output
$15/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.3 Chat

Mar 3, 2026

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more directly helpful. It delivers more accurate answers with better contextualizati...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$1.75/M
Output
$14/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 3.1 Flash Lite Preview

Mar 2, 2026

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemini 2.5 Flash Lite on overall quality and approaches Gemini 2.5 Flash performance...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$0.250/M
Output
$1.50/M
Tool use, Vision, Audio, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)

Feb 26, 2026

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state-of-the-art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines advanced...

Context
65,536 tokens
Benchmark
Not listed yet
Input
$0.500/M
Output
$3.00/M
Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3.5-35B-A3B

Feb 25, 2026

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inf...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.163/M
Output
$1.30/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3.5-27B

Feb 25, 2026

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance. Its overall capabilities a...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.195/M
Output
$1.56/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3.5-122B-A10B

Feb 25, 2026

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference eff...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.260/M
Output
$2.08/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3.5-Flash

Feb 25, 2026

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference effic...

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$0.065/M
Output
$0.260/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 3.1 Pro Preview Custom Tools

Feb 25, 2026

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing overuse of a general bash tool when more efficient third-party or user-defined fu...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$12/M
Tool use, Vision, Audio, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.3-Codex

Feb 24, 2026

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of GPT-5.2-Codex with the broader reasoning and professional knowledge capabilitie...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$1.75/M
Output
$14/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 3.1 Pro Preview

Feb 19, 2026

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, improved agentic reliability, and more efficient token usage across complex workflows...

Context
1,048,576 tokens
Benchmark
Arena Leaderboard: 1,494
Input
$2.00/M
Output
$12/M
Tool use, Vision, Audio, Reasoning
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude Sonnet 4.6

Feb 17, 2026

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and professional work. It excels at iterative development, complex codebase navigation, ...

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$3.00/M
Output
$15/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3.5 Plus 2026-02-15

Feb 16, 2026

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference e...

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$0.260/M
Output
$1.56/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3.5 397B A17B

Feb 16, 2026

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher infere...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.390/M
Output
$2.34/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Max Thinking

Feb 9, 2026

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By significantly scaling model capacity and re...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.780/M
Output
$3.90/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude Opus 4.6

Feb 4, 2026

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that operate across entire workflows rather than single prompts, making it especially eff...

Context
1,000,000 tokens
Benchmark
Arena Leaderboard: 1,504
Input
$5.00/M
Output
$25/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Coder Next

Feb 3, 2026

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total parameters and only 3B activated per to...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.120/M
Output
$0.750/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Kimi / Moonshot AI

MoonshotAI: Kimi K2.5

Jan 26, 2026

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-directed agent swarm paradigm. Built on Kimi K2 with continued pretraining over appr...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.383/M
Output
$1.72/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT Audio

Jan 19, 2026

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Audio is p...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$2.50/M
Output
$10/M
Tool use, Audio
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT Audio Mini

Jan 19, 2026

A cost-efficient version of GPT Audio. The new snapshot features an upgraded decoder for more natural sounding voices and maintains better voice consistency. Input is priced at $0.60 per million token...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$0.600/M
Output
$2.40/M
Tool use, Audio
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.2-Codex

Jan 14, 2026

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution ...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$1.75/M
Output
$14/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 3 Flash Preview

Dec 17, 2025

Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and coding assistance. It delivers near-Pro-level reasoning and tool-use performance ...

Context
1,048,576 tokens
Benchmark
Arena Leaderboard: 1,474
Input
$0.500/M
Output
$3.00/M
Tool use, Vision, Audio, Reasoning
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Mistral Small Creative

Dec 16, 2025

Mistral Small Creative is an experimental small model designed for creative writing, narrative generation, roleplay and character-driven dialogue, general-purpose instruction following, and conversati...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.100/M
Output
$0.300/M
Tool use
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.2 Chat

Dec 10, 2025

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “thi...

Context
128,000 tokens
Benchmark
Arena Leaderboard: 1,478
Input
$1.75/M
Output
$14/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.2 Pro

Dec 10, 2025

GPT-5.2 Pro is OpenAI’s most advanced model, offering major improvements in agentic coding and long context performance over GPT-5 Pro. It is optimized for complex tasks that require step-by-step reas...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$21/M
Output
$168/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.2

Dec 10, 2025

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance compared to GPT-5.1. It uses adaptive reasoning to allocate computation dynamicall...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$1.75/M
Output
$14/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Devstral 2 2512

Dec 9, 2025

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 256K context window. Devstral 2 supports e...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.400/M
Output
$2.00/M
Tool use
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.1-Codex-Max

Dec 4, 2025

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development tasks. It is based on an updated version of the 5.1 reasoning stack and trained ...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$1.25/M
Output
$10/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Ministral 3 14B 2512

Dec 2, 2025

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language ...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.200/M
Output
$0.200/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Ministral 3 8B 2512

Dec 2, 2025

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities....

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.150/M
Output
$0.150/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Ministral 3 3B 2512

Dec 2, 2025

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities....

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.100/M
Output
$0.100/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Mistral Large 3 2512

Dec 1, 2025

Mistral Large 3 2512 is Mistral’s most capable model to date, featuring a sparse mixture-of-experts architecture with 41B active parameters (675B total), and released under the Apache 2.0 license....

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.500/M
Output
$1.50/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: DeepSeek V3.2 Speciale

Dec 1, 2025

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on DeepSeek Sparse Attention (DSA) for efficient long-context proce...

Context
163,840 tokens
Benchmark
Not listed yet
Input
$0.400/M
Output
$1.20/M
Reasoning
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: DeepSeek V3.2

Dec 1, 2025

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and agentic tool-use performance. It introduces DeepSeek Sparse Attention (DSA), a fin...

Context
163,840 tokens
Benchmark
Not listed yet
Input
$0.260/M
Output
$0.380/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude Opus 4.5

Nov 24, 2025

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, and long-horizon computer use. It offers strong multimodal capabilities, competit...

Context
200,000 tokens
Benchmark
Arena Leaderboard: 1,474
Input
$5.00/M
Output
$25/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Nano Banana Pro (Gemini 3 Pro Image Preview)

Nov 20, 2025

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world gr...

Context
65,536 tokens
Benchmark
Arena Leaderboard: 1,485
Input
$2.00/M
Output
$12/M
Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok 4.1 Fast

Nov 19, 2025

Grok 4.1 Fast is xAI's best agentic tool calling model that shines in real-world use cases like customer support and deep research. 2M context window. Reasoning can be enabled/disabled using the `rea...

Context
2,000,000 tokens
Benchmark
Arena Leaderboard: 1,473
Input
$0.200/M
Output
$0.500/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 3 Pro Preview

Nov 18, 2025

Gemini 3 Pro is Google’s flagship frontier model for high-precision multimodal reasoning, combining strong performance across text, image, video, audio, and code with a 1M-token context window. Reason...

Context
1,048,576 tokens
Benchmark
Arena Leaderboard: 1,486
Input
$2.00/M
Output
$12/M
Tool use, Vision, Audio, Reasoning
Last refreshed Mar 26, 2026
Open model page
OpenAI

OpenAI: GPT-5.1

Nov 13, 2025

GPT-5.1 is the latest frontier-grade model in the GPT-5 series, offering stronger general-purpose reasoning, improved instruction adherence, and a more natural conversational style compared to GPT-5. ...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$1.25/M
Output
$10/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.1 Chat

Nov 13, 2025

GPT-5.1 Chat (AKA Instant) is the fast, lightweight member of the 5.1 family, optimized for low-latency chat while retaining strong general intelligence. It uses adaptive reasoning to selectively “thin...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$1.25/M
Output
$10/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.1-Codex

Nov 13, 2025

GPT-5.1-Codex is a specialized version of GPT-5.1 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of c...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$1.25/M
Output
$10/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5.1-Codex-Mini

Nov 13, 2025

GPT-5.1-Codex-Mini is a smaller and faster version of GPT-5.1-Codex...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$0.250/M
Output
$2.00/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Kimi / Moonshot AI

MoonshotAI: Kimi K2 Thinking

Nov 6, 2025

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) arc...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.470/M
Output
$2.00/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Perplexity

Perplexity: Sonar Pro Search

Oct 30, 2025

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper reasoning and analysis. Pricing is based on ...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$3.00/M
Output
$15/M
Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Voxtral Small 24B 2507

Oct 30, 2025

Voxtral Small is an enhancement of Mistral Small 3, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translati...

Context
32,000 tokens
Benchmark
Not listed yet
Input
$0.100/M
Output
$0.300/M
Tool use, Audio
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: gpt-oss-safeguard-20b

Oct 29, 2025

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers lower latency for safety tasks like content ...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.075/M
Output
$0.300/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 VL 32B Instruct

Oct 23, 2025

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines ...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.104/M
Output
$0.416/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5 Image Mini

Oct 16, 2025

GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by GPT-5 Mini (https://openrouter.ai/openai/gpt-5-mini), with GPT Image 1 Mini for efficient image generation. This natively...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$2.50/M
Output
$2.00/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude Haiku 4.5

Oct 15, 2025

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s perfor...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$1.00/M
Output
$5.00/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 VL 8B Thinking

Oct 14, 2025

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.117/M
Output
$1.36/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 VL 8B Instruct

Oct 14, 2025

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.080/M
Output
$0.500/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5 Image

Oct 14, 2025

GPT-5 Image combines OpenAI's GPT-5 model (https://openrouter.ai/openai/gpt-5) with state-of-the-art image generation capabilities. It offers major improvements in reasoning, code quality, and user e...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$10/M
Output
$10/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: o3 Deep Research

Oct 10, 2025

o3-deep-research is OpenAI's advanced model for deep research, designed to tackle complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds additional cost....

Context
200,000 tokens
Benchmark
Not listed yet
Input
$10/M
Output
$40/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: o4 Mini Deep Research

Oct 10, 2025

o4-mini-deep-research is OpenAI's faster, more affordable deep research model—ideal for tackling complex, multi-step research tasks. Note: This model always uses the 'web_search' tool which adds addi...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$8.00/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Nano Banana (Gemini 2.5 Flash Image)

Oct 7, 2025

Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state-of-the-art image generation model with contextual understanding. It is capable of image generation, edits, and m...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.300/M
Output
$2.50/M
Vision
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 VL 30B A3B Thinking

Oct 6, 2025

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex ...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.130/M
Output
$1.56/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 VL 30B A3B Instruct

Oct 6, 2025

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general mu...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.130/M
Output
$0.520/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5 Pro

Oct 6, 2025

GPT-5 Pro is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instructi...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$15/M
Output
$120/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude Sonnet 4.5

Sep 29, 2025

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-ben...

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$3.00/M
Output
$15/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: DeepSeek V3.2 Exp

Sep 29, 2025

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and future architectures. It introduces DeepSeek Sparse Attention (DSA), a fine-grai...

Context
163,840 tokens
Benchmark
Not listed yet
Input
$0.270/M
Output
$0.410/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 2.5 Flash Lite Preview 09-2025

Sep 25, 2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$0.100/M
Output
$0.400/M
Tool use, Vision, Audio, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 VL 235B A22B Thinking

Sep 23, 2025

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is optimized for multimodal reasoning in STE...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.260/M
Output
$2.60/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 VL 235B A22B Instruct

Sep 23, 2025

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.200/M
Output
$0.880/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Max

Sep 23, 2025

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the Janua...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.780/M
Output
$3.90/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Coder Plus

Sep 23, 2025

Qwen3 Coder Plus is Alibaba's proprietary version of the open-source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous programming via tool calling and environment ...

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$0.650/M
Output
$3.25/M
Tool use
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5 Codex

Sep 23, 2025

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of compl...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$1.25/M
Output
$10/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: DeepSeek V3.1 Terminus (exacto)

Sep 22, 2025

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing issues reported by users, including language cons...

Context
163,840 tokens
Benchmark
Not listed yet
Input
$0.210/M
Output
$0.790/M
Tool use, Reasoning
Last refreshed Mar 10, 2026
Open model page
DeepSeek

DeepSeek: DeepSeek V3.1 Terminus

Sep 22, 2025

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original capabilities while addressing issues reported by users, including language cons...

Context
163,840 tokens
Benchmark
Not listed yet
Input
$0.210/M
Output
$0.790/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok 4 Fast

Sep 18, 2025

Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window. It comes in two flavors: non-reasoning and reasoning. Read more about the model on xAI's [news pos...

Context
2,000,000 tokens
Benchmark
Not listed yet
Input
$0.200/M
Output
$0.500/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Coder Flash

Sep 17, 2025

Qwen3 Coder Flash is Alibaba's fast and cost-efficient version of their proprietary Qwen3 Coder Plus. It is a powerful coding agent model specializing in autonomous programming via tool calling and en...

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$0.195/M
Output
$0.975/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Next 80B A3B Thinking

Sep 11, 2025

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard multi-step problems: math proofs, code s...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.098/M
Output
$0.780/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Next 80B A3B Instruct (free)

Sep 11, 2025

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code ...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Next 80B A3B Instruct

Sep 11, 2025

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code ...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.090/M
Output
$1.10/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen Plus 0728 (thinking)

Sep 8, 2025

Qwen Plus 0728, based on the Qwen3 foundation model, is a hybrid reasoning model with a 1-million-token context window and a balanced combination of performance, speed, and cost....

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$0.260/M
Output
$0.780/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen Plus 0728

Sep 8, 2025

Qwen Plus 0728, based on the Qwen3 foundation model, is a hybrid reasoning model with a 1-million-token context window and a balanced combination of performance, speed, and cost....

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$0.260/M
Output
$0.780/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Kimi / Moonshot AI

MoonshotAI: Kimi K2 0905 (exacto)

Sep 4, 2025

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters ...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.600/M
Output
$2.50/M
Tool use
Last refreshed Mar 10, 2026
Open model page
Kimi / Moonshot AI

MoonshotAI: Kimi K2 0905

Sep 4, 2025

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters ...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.400/M
Output
$2.00/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 30B A3B Thinking 2507

Aug 28, 2025

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking. The model is designed specifically for “thinking m...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.080/M
Output
$0.400/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok Code Fast 1

Aug 26, 2025

Grok Code Fast 1 is a speedy and economical reasoning model that excels at agentic coding. With reasoning traces visible in the response, developers can steer Grok Code for high-quality workflows....

Context
256,000 tokens
Benchmark
Not listed yet
Input
$0.200/M
Output
$1.50/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: DeepSeek V3.1

Aug 21, 2025

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase ...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.150/M
Output
$0.750/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-4o Audio

Aug 14, 2025

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio ...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$2.50/M
Output
$10/M
Tool use, Audio
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Mistral Medium 3.1

Aug 13, 2025

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced opera...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.400/M
Output
$2.00/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5 Chat

Aug 7, 2025

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications....

Context
128,000 tokens
Benchmark
Not listed yet
Input
$1.25/M
Output
$10/M
Vision
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5

Aug 7, 2025

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction f...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$1.25/M
Output
$10/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5 Mini

Aug 7, 2025

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency an...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$0.250/M
Output
$2.00/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-5 Nano

Aug 7, 2025

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to ...

Context
400,000 tokens
Benchmark
Not listed yet
Input
$0.050/M
Output
$0.400/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: gpt-oss-120b (free)

Aug 5, 2025

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B par...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: gpt-oss-120b

Aug 5, 2025

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B par...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.039/M
Output
$0.190/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: gpt-oss-120b (exacto)

Aug 5, 2025

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B par...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.039/M
Output
$0.190/M
Tool use, Reasoning
Last refreshed Mar 10, 2026
Open model page
OpenAI

OpenAI: gpt-oss-20b (free)

Aug 5, 2025

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimiz...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: gpt-oss-20b

Aug 5, 2025

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimiz...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.030/M
Output
$0.110/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude Opus 4.1

Aug 5, 2025

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable ga...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$15/M
Output
$75/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Codestral 2508

Aug 1, 2025

Mistral's cutting-edge language model for coding, released at the end of July 2025. Codestral specializes in low-latency, high-frequency tasks such as fill-in-the-middle (FIM), code correction and test genera...

Context
256,000 tokens
Benchmark
Not listed yet
Input
$0.300/M
Output
$0.900/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Coder 30B A3B Instruct

Jul 31, 2025

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, an...

Context
160,000 tokens
Benchmark
Not listed yet
Input
$0.070/M
Output
$0.270/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 30B A3B Instruct 2507

Jul 29, 2025

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3.3B active parameters per inference. It operates in non-thinking mode and is designed for high-quali...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.090/M
Output
$0.300/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 235B A22B Thinking 2507

Jul 25, 2025

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.150/M
Output
$1.50/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Coder 480B A35B

Jul 22, 2025

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-con...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.220/M
Output
$1.00/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 Coder 480B A35B (exacto)

Jul 22, 2025

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-con...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.220/M
Output
$1.80/M
Tool use
Last refreshed Mar 10, 2026
Open model page
Qwen

Qwen: Qwen3 Coder 480B A35B (free)

Jul 22, 2025

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-con...

Context
262,000 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 2.5 Flash Lite

Jul 22, 2025

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$0.100/M
Output
$0.400/M
Tool use, Vision, Audio, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 235B A22B Instruct 2507

Jul 21, 2025

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized ...

Context
262,144 tokens
Benchmark
Not listed yet
Input
$0.071/M
Output
$0.100/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Kimi / Moonshot AI

MoonshotAI: Kimi K2 0711

Jul 11, 2025

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for a...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.570/M
Output
$2.30/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Devstral Medium

Jul 10, 2025

Devstral Medium is a high-performance code generation and agentic reasoning model developed jointly by Mistral AI and All Hands AI. Positioned as a step up from Devstral Small, it achieves 61.6% on SW...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.400/M
Output
$2.00/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Devstral Small 1.1

Jul 10, 2025

Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering agents, developed by Mistral AI in collaboration with All Hands AI. Finetuned from Mistral Small 3.1 and relea...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.100/M
Output
$0.300/M
Tool use
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok 4

Jul 9, 2025

Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel tool calling, structured outputs, and both image and text inputs. Note that reasoning is not exposed, reasoning ...

Context
256,000 tokens
Benchmark
Arena Leaderboard: 6,063
Input
$3.00/M
Output
$15/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemma 3n 2B (free)

Jul 9, 2025

Gemma 3n E2B IT is a multimodal, instruction-tuned model developed by Google DeepMind, designed to operate efficiently at an effective parameter size of 2B while leveraging a 6B architecture. Based on...

Context
8,192 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Capability flags pending
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Mistral Small 3.2 24B

Jun 20, 2025

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$0.075/M
Output
$0.200/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 2.5 Flash

Jun 17, 2025

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, en...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$0.300/M
Output
$2.50/M
Tool use, Vision, Audio, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 2.5 Pro

Jun 17, 2025

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through respo...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$1.25/M
Output
$10/M
Tool use, Vision, Audio, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: o3 Pro

Jun 10, 2025

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently be...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$20/M
Output
$80/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok 3 Mini

Jun 10, 2025

A lightweight model that thinks before responding. Fast, smart, and great for logic-based tasks that do not require deep domain knowledge. The raw thinking traces are accessible....

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.300/M
Output
$0.500/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok 3

Jun 10, 2025

Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in finance, hea...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$3.00/M
Output
$15/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 2.5 Pro Preview 06-05

Jun 5, 2025

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through respo...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$1.25/M
Output
$10/M
Tool use, Vision, Audio, Reasoning
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: R1 0528

May 28, 2025

May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in siz...

Context
163,840 tokens
Benchmark
Not listed yet
Input
$0.450/M
Output
$2.15/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude Opus 4

May 22, 2025

Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in software...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$15/M
Output
$75/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude Sonnet 4

May 22, 2025

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$3.00/M
Output
$15/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemma 3n 4B

May 20, 2025

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enab...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.020/M
Output
$0.040/M
Capability flags pending
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemma 3n 4B (free)

May 20, 2025

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enab...

Context
8,192 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Capability flags pending
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Mistral Medium 3

May 7, 2025

Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances state-of-the-art reasoning...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.400/M
Output
$2.00/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 2.5 Pro Preview 05-06

May 6, 2025

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through respo...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$1.25/M
Output
$10/M
Tool use, Vision, Audio, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 4B (free)

Apr 30, 2025

Qwen3-4B is a 4 billion parameter dense language model from the Qwen3 series, designed to support both general-purpose and reasoning-intensive tasks. It introduces a dual-mode architecture—thinking an...

Context
40,960 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Tool use, Reasoning
Last refreshed Mar 29, 2026
Open model page
Meta

Meta: Llama Guard 4 12B

Apr 29, 2025

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs ...

Context
163,840 tokens
Benchmark
Not listed yet
Input
$0.180/M
Output
$0.180/M
Vision
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 30B A3B

Apr 28, 2025

Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tas...

Context
40,960 tokens
Benchmark
Not listed yet
Input
$0.080/M
Output
$0.280/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 8B

Apr 28, 2025

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode f...

Context
40,960 tokens
Benchmark
Not listed yet
Input
$0.050/M
Output
$0.400/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 14B

Apr 28, 2025

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode f...

Context
40,960 tokens
Benchmark
Not listed yet
Input
$0.060/M
Output
$0.240/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 32B

Apr 28, 2025

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode ...

Context
40,960 tokens
Benchmark
Not listed yet
Input
$0.080/M
Output
$0.240/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen3 235B A22B

Apr 28, 2025

Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching between a "thinking" mode for complex r...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.455/M
Output
$1.82/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: o4 Mini High

Apr 16, 2025

OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high. OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$1.10/M
Output
$4.40/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: o3

Apr 16, 2025

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following. Use...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$8.00/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: o4 Mini

Apr 16, 2025

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonst...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$1.10/M
Output
$4.40/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen2.5 Coder 7B Instruct

Apr 15, 2025

Qwen2.5-Coder-7B-Instruct is a 7B parameter instruction-tuned language model optimized for code-related tasks such as code generation, reasoning, and bug fixing. Based on the Qwen2.5 architecture, it ...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.030/M
Output
$0.090/M
Capability flags pending
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-4.1

Apr 14, 2025

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and o...

Context
1,047,576 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$8.00/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-4.1 Mini

Apr 14, 2025

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instructi...

Context
1,047,576 tokens
Benchmark
Not listed yet
Input
$0.400/M
Output
$1.60/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-4.1 Nano

Apr 14, 2025

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, an...

Context
1,047,576 tokens
Benchmark
Not listed yet
Input
$0.100/M
Output
$0.400/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok 3 Mini Beta

Apr 9, 2025

Grok 3 Mini is a lightweight, smaller thinking model. Unlike traditional models that generate answers immediately, Grok 3 Mini thinks before responding. It’s ideal for reasoning-heavy tasks that don’t...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.300/M
Output
$0.500/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
xAI

xAI: Grok 3 Beta

Apr 9, 2025

Grok 3 is the latest model from xAI. It's their flagship model that excels at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in finance, hea...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$3.00/M
Output
$15/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Meta

Meta: Llama 4 Maverick

Apr 5, 2025

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forw...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$0.150/M
Output
$0.600/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Meta

Meta: Llama 4 Scout

Apr 5, 2025

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input (text and ...

Context
327,680 tokens
Benchmark
Not listed yet
Input
$0.080/M
Output
$0.300/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen2.5 VL 32B Instruct

Mar 24, 2025

Qwen2.5-VL-32B is a multimodal vision-language model fine-tuned through reinforcement learning for enhanced mathematical reasoning, structured outputs, and visual problem-solving capabilities. It exce...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$0.200/M
Output
$0.600/M
Vision
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: DeepSeek V3 0324

Mar 24, 2025

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the [DeepSeek V3](/deepseek/deepseek-chat-v3) mo...

Context
163,840 tokens
Benchmark
Not listed yet
Input
$0.200/M
Output
$0.770/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: o1-pro

Mar 19, 2025

The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro model uses more compute to think harder and provide consistently b...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$150/M
Output
$600/M
Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Mistral Small 3.1 24B

Mar 17, 2025

Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring 24 billion parameters with advanced multimodal capabilities. It provides state-of-the-art performance in text...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.030/M
Output
$0.110/M
Vision
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Mistral Small 3.1 24B (free)

Mar 17, 2025

Mistral Small 3.1 24B Instruct is an upgraded variant of Mistral Small 3 (2501), featuring 24 billion parameters with advanced multimodal capabilities. It provides state-of-the-art performance in text...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Tool use, Vision
Last refreshed Mar 29, 2026
Open model page
Google

Google: Gemma 3 4B

Mar 13, 2025

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, ...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.040/M
Output
$0.080/M
Vision
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemma 3 4B (free)

Mar 13, 2025

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, ...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Vision
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemma 3 12B

Mar 13, 2025

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, ...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.040/M
Output
$0.130/M
Vision
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemma 3 12B (free)

Mar 13, 2025

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, ...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Vision
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-4o-mini Search Preview

Mar 12, 2025

GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries....

Context
128,000 tokens
Benchmark
Not listed yet
Input
$0.150/M
Output
$0.600/M
Capability flags pending
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-4o Search Preview

Mar 12, 2025

GPT-4o Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries....

Context
128,000 tokens
Benchmark
Not listed yet
Input
$2.50/M
Output
$10/M
Capability flags pending
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemma 3 27B (free)

Mar 12, 2025

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, ...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Vision
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemma 3 27B

Mar 12, 2025

Gemma 3 introduces multimodality, supporting vision-language input and text outputs. It handles context windows up to 128k tokens, understands over 140 languages, and offers improved math, reasoning, ...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.080/M
Output
$0.160/M
Vision
Last refreshed Apr 6, 2026
Open model page
Perplexity

Perplexity: Sonar Reasoning Pro

Mar 6, 2025

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro). Sonar Reason...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$8.00/M
Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Perplexity

Perplexity: Sonar Pro

Mar 6, 2025

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing#detailed-pricing-breakdown-for-sonar-reasoning-pro-and-sonar-pro). For enterpri...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$3.00/M
Output
$15/M
Vision
Last refreshed Apr 6, 2026
Open model page
Perplexity

Perplexity: Sonar Deep Research

Mar 6, 2025

Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across complex topics. It autonomously searches, reads, and evaluates sources, refining its ...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$8.00/M
Reasoning
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: QwQ 32B

Mar 5, 2025

QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in d...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.150/M
Output
$0.580/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 2.0 Flash Lite

Feb 25, 2025

Gemini 2.0 Flash Lite offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality on par with larger models like [Gemin...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$0.075/M
Output
$0.300/M
Tool use, Vision, Audio
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude 3.7 Sonnet

Feb 24, 2025

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rap...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$3.00/M
Output
$15/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude 3.7 Sonnet (thinking)

Feb 24, 2025

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rap...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$3.00/M
Output
$15/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Saba

Feb 17, 2025

Mistral Saba is a 24B-parameter language model specifically designed for the Middle East and South Asia, delivering accurate and contextually relevant responses while maintaining efficient performance...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.200/M
Output
$0.600/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Meta

Llama Guard 3 8B

Feb 12, 2025

Llama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classificati...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.020/M
Output
$0.060/M
Capability flags pending
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: o3 Mini High

Feb 12, 2025

OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with `reasoning_effort` set to high. o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly exc...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$1.10/M
Output
$4.40/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Google

Google: Gemini 2.0 Flash

Feb 5, 2025

Gemini 2.0 Flash offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality on par with larger models like [Gemini Pro...

Context
1,048,576 tokens
Benchmark
Not listed yet
Input
$0.100/M
Output
$0.400/M
Tool use, Vision, Audio
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen VL Plus

Feb 4, 2025

Qwen's Enhanced Large Visual Language Model. Significantly upgraded for detailed recognition capabilities and text recognition abilities, supporting ultra-high pixel resolutions up to millions of pixe...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.137/M
Output
$0.410/M
Vision
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen VL Max

Feb 1, 2025

Qwen VL Max is a visual understanding model with 7500 tokens context length. It excels in delivering optimal performance for a broader spectrum of complex tasks....

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.520/M
Output
$2.08/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen-Turbo

Feb 1, 2025

Qwen-Turbo, based on Qwen2.5, is a 1M context model that provides fast speed and low cost, suitable for simple tasks....

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.033/M
Output
$0.130/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen2.5 VL 72B Instruct

Feb 1, 2025

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images....

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.800/M
Output
$0.800/M
Vision
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen-Plus

Feb 1, 2025

Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced performance, speed, and cost combination....

Context
1,000,000 tokens
Benchmark
Not listed yet
Input
$0.260/M
Output
$0.780/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen: Qwen-Max

Feb 1, 2025

Qwen-Max, based on Qwen2.5, provides the best inference performance among [Qwen models](/qwen), especially for complex multi-step tasks. It's a large-scale MoE model that has been pretrained on over 2...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$1.04/M
Output
$4.16/M
Tool use
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: o3 Mini

Jan 31, 2025

OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. This model supports the `reasoning_effort` parameter,...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$1.10/M
Output
$4.40/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Mistral Small 3

Jan 30, 2025

Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it features both pre-trained and instruction-tune...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.050/M
Output
$0.080/M
Capability flags pending
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: R1 Distill Qwen 32B

Jan 29, 2025

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperfor...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.290/M
Output
$0.290/M
Reasoning
Last refreshed Apr 6, 2026
Open model page
Perplexity

Perplexity: Sonar

Jan 27, 2025

Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources. It is designed for companies seeking to integrate lightweight question-and-ans...

Context
127,072 tokens
Benchmark
Not listed yet
Input
$1.00/M
Output
$1.00/M
Vision
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: R1 Distill Llama 70B

Jan 23, 2025

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The mo...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.700/M
Output
$0.800/M
Reasoning
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: R1

Jan 20, 2025

DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass. Fully ...

Context
64,000 tokens
Benchmark
Not listed yet
Input
$0.700/M
Output
$2.50/M
Tool use, Reasoning
Last refreshed Apr 6, 2026
Open model page
DeepSeek

DeepSeek: DeepSeek V3

Dec 26, 2024

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported ev...

Context
163,840 tokens
Benchmark
Not listed yet
Input
$0.320/M
Output
$0.890/M
Tool use
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: o1

Dec 17, 2024

The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 model series is trained with large-scale reinforcement learning to reason using ...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$15/M
Output
$60/M
Tool use, Vision, Reasoning
Last refreshed Apr 6, 2026
Open model page
Meta

Meta: Llama 3.3 70B Instruct

Dec 6, 2024

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model is optimize...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$0.100/M
Output
$0.320/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Meta

Meta: Llama 3.3 70B Instruct (free)

Dec 6, 2024

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model is optimize...

Context
65,536 tokens
Benchmark
Not listed yet
Input
$0.000/M
Output
$0.000/M
Tool use
Last refreshed Apr 6, 2026
Open model page
OpenAI

OpenAI: GPT-4o (2024-11-20)

Nov 20, 2024

The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more natural, engaging, and tailored writing to improve relevance & readability. It’s also better at working with upl...

Context
128,000 tokens
Benchmark
Not listed yet
Input
$2.50/M
Output
$10/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral Large 2411

Nov 18, 2024

Mistral Large 2 2411 is an update of [Mistral Large 2](/mistralai/mistral-large) released together with [Pixtral Large 2411](/mistralai/pixtral-large-2411). It provides a significant upgrade on the pr...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$6.00/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral Large 2407

Nov 18, 2024

This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch annou...

Context
131,072 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$6.00/M
Tool use
Last refreshed Apr 6, 2026
Open model page
Mistral

Mistral: Pixtral Large 2411

Nov 18, 2024

Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral Large 2](/mistralai/mistral-large-2411). The model is able to understand documents, charts and natural images....

Context
131,072 tokens
Benchmark
Not listed yet
Input
$2.00/M
Output
$6.00/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Qwen

Qwen2.5 Coder 32B Instruct

Nov 11, 2024

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements upon CodeQwen1.5: - Significantly improvem...

Context
32,768 tokens
Benchmark
Not listed yet
Input
$0.660/M
Output
$1.00/M
Capability flags pending
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude 3.5 Haiku

Nov 3, 2024

Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for d...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$0.800/M
Output
$4.00/M
Tool use, Vision
Last refreshed Apr 6, 2026
Open model page
Anthropic

Anthropic: Claude 3.5 Sonnet

Oct 21, 2024

New Claude 3.5 Sonnet delivers better-than-Opus capabilities, faster-than-Sonnet speeds, at the same Sonnet prices. Sonnet is particularly good at: - Coding: Scores ~49% on SWE-Bench Verified, higher...

Context
200,000 tokens
Benchmark
Not listed yet
Input
$6.00/M
Output
$30/M
Tool use, Vision
Last refreshed Apr 5, 2026
Open model page