AI Models Guide

A quick-reference to the major AI models, who makes them, and what they do best.

Updated March 2026

Quick Reference

Model	Company	Best For	Key Differentiator
GPT-5.x	OpenAI	General purpose	Dynamic routing picks the right sub-model per request
Claude 4.6	Anthropic	Coding & reasoning	Top human-preference scores; agentic autonomy
Gemini 3.x	Google DeepMind	Multimodal	Best benchmark breadth; strong pricing; Workspace integration
Grok 4.x	xAI	Real-time info	Live X/Twitter data; multi-agent parallel reasoning
Llama 4	Meta	Open-source	10M token context; fully self-hostable
DeepSeek V3/R1	DeepSeek	Cost efficiency	Frontier performance trained for under $6M; open source
Mistral Large	Mistral	EU compliance	Enterprise-safe; strong open-weight models for regulated industries
Qwen 3	Alibaba	Multilingual	119 languages; models from 0.6B to 1T+ parameters
Kimi K2.5	Moonshot AI	Agentic open-source	1T params; Agent Swarm coordinates up to 100 sub-agents
GLM-5	Zhipu AI	Open frontier	744B MoE; MIT license; trained on zero NVIDIA GPUs
Sonar	Perplexity	Search & research	Search-grounded answers at 1200 tok/s; built on Llama 3.3 + Cerebras
Composer	Cursor	AI-native coding	MoE model trained via RL for software engineering; background agents

Frontier Language Models

GPT-5.x Series

OpenAI · San Francisco

The versatile all-rounder with dynamic internal routing.

GPT-5.4 is the current flagship with a 1M token context window
Uses an internal router to select the right sub-model per request in real time
Native computer use and tool calling for agentic automation
Strong at documentation, unit tests, and complex SQL queries

Best for: Rapid prototyping, general-purpose tasks, API automation

Claude 4.6 Family

Anthropic · San Francisco

The developer favorite for coding, reasoning, and safety.

Opus 4.6 (flagship), Sonnet 4.6 (best value), Haiku 4.5 (fast/cheap)
Leads human-preference leaderboards; strong ARC-AGI-2 scores
Agentic capabilities: autonomous multi-step coding tasks
Known for safety, steerability, and high-quality long-form writing

Best for: Software development, agentic workflows, analysis, writing

Gemini 3.x

Google DeepMind · Mountain View

Multimodal powerhouse with top benchmark breadth.

Gemini 3.1 Pro leads the Artificial Analysis Intelligence Index
Natively multimodal: processes text, images, audio, and video
1M token context window; deep Google Workspace integration
Strong value at $2/$12 per million input/output tokens

Best for: Multimodal apps, Google Workspace users, cost-conscious teams
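To make the pricing line concrete, here is a minimal sketch of a per-request cost estimate. It assumes the $2/$12 per million input/output token rates quoted above apply flat across the whole request; the helper name `estimate_cost` and the example workload are hypothetical, and real billing may include tiers or other charges not covered in this guide.

```python
# Prices in USD per million tokens, taken from the Gemini 3.x entry above.
# Assumption: flat rates, no tiering or caching discounts.
INPUT_PRICE_PER_M = 2.00
OUTPUT_PRICE_PER_M = 12.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 200K-token document summarized into 2K tokens.
    print(f"${estimate_cost(200_000, 2_000):.3f}")  # prints $0.424
```

Because output tokens cost six times as much as input tokens at these rates, long-input, short-output jobs such as summarization are where this pricing is most favorable.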

Grok 4.x

xAI · Austin

Real-time data meets raw reasoning power.

Grok 4.20 uses native multi-agent architecture with four specialist agents that debate before responding
Real-time integration with X (Twitter) for current events
Scored 100% on AIME 2025 math competition (Heavy variant)
Less filtered personality; positioned as an alternative to corporate AI

Best for: Real-time info, math/science, users who want fewer guardrails

Sonar (Perplexity)

Perplexity AI · San Francisco

Search-native AI built for grounded, cited answers.

Built on Llama 3.3 70B, further trained for search-grounded factuality
Runs at 1,200 tokens/sec on Cerebras inference hardware
Model family: Sonar, Sonar Pro, Reasoning Pro, Deep Research
Matches GPT-4o on user satisfaction benchmarks

Best for: Research with citations, competitive analysis, fact-checking

Composer 1.5 (Cursor)

Cursor · San Francisco

AI-native coding model built for software engineering.

Mixture-of-experts model trained via RL in real development environments
4x faster generation than comparable frontier coding models
Background agents run tasks autonomously while you work
Cursor Automations trigger agents from GitHub PRs, Slack, Linear, PagerDuty

Best for: Agentic coding, background development, automated workflows

Open Source & Cost-Efficient

Llama 4

Meta · Menlo Park

The leading open-source model family.

Llama 4 Scout: industry-leading 10M token context window
Fully open weights; can be self-hosted for complete data control
Performance competitive with paid frontier models
Requires powerful hardware for full-scale deployment

Best for: Self-hosting, privacy-sensitive orgs, custom fine-tuning

DeepSeek V3.2 / R1

DeepSeek · Hangzhou, China

Frontier performance at a fraction of the cost.

671B parameter MoE model famously trained for under $6M
R1 variant excels at transparent, step-by-step reasoning
V3.2-Speciale matches GPT-5 level performance on key benchmarks
Open-source with competitive benchmark scores; V4 multimodal imminent

Best for: Budget-conscious teams, math/reasoning, open-source self-hosting

Mistral Large 3

Mistral AI · Paris, France

The enterprise-safe European choice.

Strong open-weight models designed for EU AI Act compliance
Default choice for regulated industries (finance, healthcare, government)
Good multilingual support, especially European languages
Balances capability with privacy and sovereignty requirements

Best for: EU-regulated enterprises, privacy-first deployments

Kimi K2.5

Moonshot AI · Beijing, China

Open-source agentic powerhouse with Agent Swarm.

1T total parameters (32B active), MoE architecture with native vision-language
Agent Swarm coordinates up to 100 specialized sub-agents in parallel
Kimi Code CLI agent rivals Claude Code and Gemini CLI
Backed by Alibaba and HongShan; strong global traction

Best for: Agentic workflows, open-source self-hosting, multimodal tasks

GLM-5

Zhipu AI · Beijing, China

Frontier-class model under an MIT license.

744B parameter MoE model (44B active) with 200K context window
Released under MIT license; trained entirely on Huawei Ascend chips
77.8% on SWE-bench Verified; 50.4% on Humanity's Last Exam
Priced roughly 6x cheaper than comparable proprietary models

Best for: Budget-conscious teams, open-source deployment, coding
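The total-versus-active parameter counts quoted for the MoE models in this guide can be turned into a rough sparsity figure. The sketch below uses only the numbers listed for Kimi K2.5 (1T total, 32B active) and GLM-5 (744B total, 44B active); it assumes "1T" means 1,000B, and the helper name `active_fraction` is hypothetical.

```python
# (total parameters, active parameters) in billions, as listed in this guide.
# Assumption: "1T" is read as 1000B.
MOE_MODELS = {
    "Kimi K2.5": (1000, 32),
    "GLM-5": (744, 44),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's parameters used for each token (hypothetical helper)."""
    return active_b / total_b


if __name__ == "__main__":
    for name, (total_b, active_b) in MOE_MODELS.items():
        share = active_fraction(total_b, active_b)
        print(f"{name}: {share:.1%} of parameters active per token")
```

Both models activate only a small fraction of their weights per token, which is why per-token compute is far lower than the headline parameter count suggests. All weights still have to be held in memory, so self-hosting requirements track the total count, not the active one.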

Qwen 3

Alibaba Cloud · Hangzhou, China

The multilingual giant.

Supports 119 languages with hybrid Mixture-of-Experts architecture
0.6B to 235B open-weight; Qwen3-Max (1T+) API-only; Qwen3-Coder (480B) for code
Competitive with DeepSeek-R1 and OpenAI o1 on reasoning benchmarks
Qwen3-Coder achieves 69.6% on SWE-Bench Verified, surpassing many frontier models

Best for: International businesses, multilingual apps, flexible deployment

Image & Video Generation

Midjourney v7

Midjourney

Artistic, stylized visuals with strong aesthetic control

Imagen 4

Google

Photorealistic composition, spelling, and typography accuracy

Nano Banana 2

Google

Fast AI image editing, remixing, and style transfers; built on Gemini Flash

DALL-E 4

OpenAI

Integrated with ChatGPT; strong prompt adherence

Stable Diffusion 3.5

Stability AI

Open-source; self-hostable; highly customizable

FLUX.2

Black Forest Labs

From the Stable Diffusion creators; up to 4MP; open-weight Klein variant

Video generation leaders: Google Veo 3.1 (native 4K + vertical video), Sora 2 (OpenAI, up to 25s + Disney partnership), Kling 3.0 (native 4K/60fps), and Runway Gen-4 (creative/cinematic).

When to Use What

Building Software

Build a full-stack app from scratch	Claude 4.6
Debug a complex codebase	Claude 4.6
Generate unit tests and docs	GPT-5.x
Rapid UI prototyping	GPT-5.x
Background agents for parallel development	Composer 1.5 (Cursor)
Open-source agentic coding	Kimi K2.5

Research & Analysis

Analyze a long PDF or contract	Gemini 3.x
Summarize a YouTube video	Gemini 3.x
Get real-time data on a trending topic	Grok 4.x
Get sourced answers with citations	Sonar (Perplexity)
Deep competitive research	Claude 4.6

Creative & Visual

Create stylized hero images	Midjourney v7
Generate photorealistic product shots	Imagen 4
Edit and remix existing images	Nano Banana 2
Generate a short video from a prompt	Veo 3 / Sora

Data & Math

Solve complex math problems step-by-step	Grok 4.x
Write and optimize SQL queries	GPT-5.x
Transparent chain-of-thought reasoning	DeepSeek R1
Analyze spreadsheet data	Gemini 3.x

Self-Hosting & Privacy

Run a model on your own infrastructure	Llama 4
Fine-tune for a domain-specific task	Llama 4
Deploy in EU-regulated environments	Mistral Large
Budget-friendly open-source alternative	DeepSeek V3
MIT-licensed frontier alternative	GLM-5

Writing & Communication

Write long-form technical content	Claude 4.6
Draft emails and business writing	GPT-5.x
Translate content across 100+ languages	Qwen 3
Summarize meeting transcripts	Gemini 3.x

There is no single "best" model in 2026. The landscape has shifted from a winner-take-all race to specialized excellence. Match the model to the task.
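For teams that want to encode this guidance in tooling, matching the model to the task can be sketched as a simple lookup table. The entries below are a small sample copied from the tables above; the `RECOMMENDATIONS` dictionary and `recommend` function are hypothetical names for illustration, not part of any vendor's API.

```python
from typing import Optional

# A sample of task-to-model pairs taken from the "When to Use What" tables.
# Keys are lowercased so lookups are case-insensitive.
RECOMMENDATIONS = {
    "debug a complex codebase": "Claude 4.6",
    "generate unit tests and docs": "GPT-5.x",
    "summarize a youtube video": "Gemini 3.x",
    "get sourced answers with citations": "Sonar (Perplexity)",
    "run a model on your own infrastructure": "Llama 4",
    "translate content across 100+ languages": "Qwen 3",
}


def recommend(task: str) -> Optional[str]:
    """Return the guide's suggested model for a task, or None if unlisted."""
    return RECOMMENDATIONS.get(task.strip().lower())


if __name__ == "__main__":
    print(recommend("Debug a complex codebase"))  # prints Claude 4.6
```

An exact-match table like this only covers the tasks listed in the guide; anything else returns `None` and needs a human decision.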

Sourced directly from company websites and documentation. Updated weekly.
