| GPT-5.x | OpenAI | General purpose | GPT-5.5 API flagship (Apr 24): $5/1M input, $30/1M output; GPT-5.5 Instant (May 5) is the new ChatGPT default for all tiers, a leaner variant with ~30% fewer words/lines, optimized for low latency; dynamic routing across sub-models |
| Claude 4.x Family | Anthropic | Coding & reasoning | Fable 5 (Jun 9) — 1M context, 128K output, $10/$50 per 1M tokens, 95% SWE-bench Verified, 80% SWE-bench Pro; Opus 4.8 (May 28) — 69.2% SWE-bench Pro, Dynamic Workflows for parallel subagents; Sonnet 4.6 (Feb 17) for speed/cost; Haiku 4.5 hits ~90% of Sonnet 4.5 coding at a fraction of the price |
| Gemini 3.x | Google DeepMind | Multimodal | Gemini 3.5 Pro (I/O 2026): limited Vertex enterprise preview; broad GA unconfirmed as of June 2026; leads 3.1 Pro on reasoning; Gemini 3.5 Flash (May 20) is the new default in the Gemini app at $1.50/$9.00 per 1M tokens; 3.1 Ultra remains the top reasoning tier |
| Grok 4.x | xAI | Real-time info | Grok 4.3 (Apr 30) is the cost-efficient API model at $1.25/$2.50 per 1M tokens with always-on reasoning and native video input (5-min clips); Grok 4 Heavy is the premium multi-agent variant (256K context, first model to hit 50% on Humanity's Last Exam) gated behind the $300/mo SuperGrok Heavy tier |
| Llama 4 | Meta | Open-source | 10M token context; fully self-hostable; Behemoth still in training/unreleased |
| Muse Spark | Meta Superintelligence Labs | Frontier reasoning | Released Apr 8, 2026 — Meta's first closed-weight frontier model; multimodal reasoning, thought compression, parallel sub-agent orchestration; top-5 on AI Intelligence Index (52) |
| DeepSeek V4 | DeepSeek | Cost efficiency | Released Apr 24, 2026: V4-Pro (1.6T MoE, $1.74/1M input rack rate; 75% off through May 31 → ~$0.435 effective) and V4-Flash (284B, $0.14/1M input); MIT license; 1M context |
| Mistral 3 Family | Mistral | EU compliance | Large 3, Magistral (reasoning), Devstral (open-source coding agent), Small 4, Voxtral — enterprise-safe with data sovereignty |
| Qwen 3.7 | Alibaba | Multilingual | Qwen 3.7 Max (May 19, closed-weight) — 56.6 AI Intelligence Index, $2.50/$7.50 per 1M tokens; Qwen 3.7 Plus with vision GA June 1; Qwen3.5 open-weight (397B) and Qwen3.6-Plus still available |
| Microsoft MAI | Microsoft | Speech & media AI | MAI-Transcribe-1, MAI-Voice-1, MAI-Image-2, plus Phi-4-reasoning (Apr 10) — Microsoft's own foundation stack on Foundry |
| Amazon Nova 2 | Amazon / AWS | AWS-native enterprise | Released Dec 2025 — Lite (1M context, MCP), Pro (reasoning), Sonic (speech-to-speech), Omni (multimodal); powers Nova Act agent (90%+ task reliability) |
| MiniMax M2.5 | MiniMax | Cost-efficient coding | Released Feb 12, 2026: 230B MoE / 10B active; 80.2% SWE-bench Verified; $0.15/$1.15 per 1M tokens; M2.7 follow-up in late April |
| Command A | Cohere | Enterprise RAG | Released Apr 7, 2026 — 111B MoE / 11B active; tuned for retrieval-augmented generation; cost-efficient enterprise tier |
| Gemma 4 | Google | On-device open-weight | 31B ranks #3 on Arena AI (1452 Elo); E2B/E4B optimized for Android; Apache 2.0 |
| Kimi K2.6 | Moonshot AI | Agentic open-source | Released Apr 20: 1T MoE, 32B active; Agent Swarm scales to 300 sub-agents; 58.6% SWE-bench Pro (top open model); Modified MIT license |
| GLM-5.1 | Zhipu AI | Open frontier | Updated Apr 6, 2026; 744B MoE; MIT license; GPQA 0.9; trained on zero NVIDIA GPUs |
| GPT-5.4-Cyber | OpenAI | Defensive cybersecurity | TAC-gated fine-tune; fewer security-related refusals; binary reverse engineering; partners: CrowdStrike, Cloudflare, Palo Alto, Cisco |
| Claude Mythos 5 | Anthropic | Security research | Graduated from Preview to full release June 9, 2026 alongside Fable 5; Project Glasswing expanded to 150+ organizations across 15+ countries for critical-infrastructure cybersecurity; GPQA 0.9; high-cost pricing retained to gate access |
| Sonar | Perplexity | Search & research | Search-grounded, citation-first answers at 1,200 tokens/s on Cerebras inference |
| Composer 2.5 | Cursor | AI-native coding | Released May 18, 2026: built on Kimi K2.5 with Cursor's post-training; 79.8% SWE-bench Multilingual, 69.3% Terminal-Bench 2.0; matches Opus 4.7 at ~1/10 the price ($0.50/$2.50 per 1M tokens) |
| SubQ | Subquadratic | Architecturally novel | First commercial subquadratic LLM (May 5, 2026) — Subquadratic Sparse Attention (SSA) scales ~linearly; 1M token production context (12M in research config); Claude Opus-level coding at ~1/20th the compute cost |
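
The per-token prices in the table can be turned into a rough per-request cost. The sketch below is illustrative only: the `Pricing` class, the helper names, and the example token counts are assumptions made for this example, not part of any vendor SDK, and the prices are copied from the rows above (models with only an input price listed are left out of the lookup table). It also reproduces the DeepSeek V4-Pro promotional arithmetic (75% off the $1.74 rack rate).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per 1M tokens, as quoted in the table (hypothetical container)."""
    input_per_1m: float
    output_per_1m: float


# Prices copied from the table rows above; only models with both
# input and output prices listed are included.
PRICES = {
    "GPT-5.5": Pricing(5.00, 30.00),
    "Fable 5": Pricing(10.00, 50.00),
    "Gemini 3.5 Flash": Pricing(1.50, 9.00),
    "Grok 4.3": Pricing(1.25, 2.50),
    "Composer 2.5": Pricing(0.50, 2.50),
}


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the given per-1M-token prices."""
    return (
        input_tokens * pricing.input_per_1m
        + output_tokens * pricing.output_per_1m
    ) / 1_000_000


def discounted(price_per_1m: float, percent_off: float) -> float:
    """Effective per-1M-token price after a percentage discount."""
    return price_per_1m * (1 - percent_off / 100)


if __name__ == "__main__":
    # Example workload (assumed): 20K input tokens, 2K output tokens.
    for name, pricing in PRICES.items():
        print(f"{name}: ${request_cost(pricing, 20_000, 2_000):.4f} per request")

    # DeepSeek V4-Pro input price during the 75%-off promotion.
    print(f"V4-Pro effective input price: ${discounted(1.74, 75):.3f} per 1M tokens")
```

Real bills may differ from this estimate, since it ignores cached-token discounts, reasoning-token accounting, and tiered pricing that individual providers may apply.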