AI model changelog

New models and benchmark results, listed as they reach the leaderboard, newest first.

June 28, 2026

GPT-5.5 — scored 78.8 on RunFree Score
Claude Opus 4.8 — scored 94.7 on RunFree Score
Claude Opus 4.7 — scored 85.7 on RunFree Score
Claude Sonnet 4.6 — scored 66.2 on RunFree Score
Gemini 3.1 Pro Preview — scored 78 on RunFree Score
Gemini 3 Flash Preview — scored 63.5 on RunFree Score
Gemini 2.5 Pro — scored 22.3 on RunFree Score
GLM 5.2 — scored 86.3 on RunFree Score
DeepSeek V4 Pro — scored 72.4 on RunFree Score
DeepSeek V4 Flash — scored 59.1 on RunFree Score
Kimi K2 Thinking — scored 45.2 on RunFree Score
Llama 4 Maverick — scored 0 on RunFree Score
Qwen3 Max Thinking — scored 94.8 on RunFree Score

June 24, 2026

Fugu Ultra — Sakana added to the leaderboard

June 17, 2026

North Mini Code (free) — Cohere added to the leaderboard

June 16, 2026

GLM 5.2 — Z.ai added to the leaderboard

June 13, 2026

GLM 5.2 — scored 91.2% on GPQA Diamond
GLM 5.2 — scored 40.5% on Humanity's Last Exam
GLM 5.2 — scored 81% on Terminal-Bench

June 12, 2026

Kimi K2.7 Code — MoonshotAI added to the leaderboard

June 9, 2026

Claude Fable Latest — Anthropic added to the leaderboard
Claude Fable 5 — Anthropic added to the leaderboard

June 8, 2026

Nex-N2-Pro — Nex AGI added to the leaderboard

June 4, 2026

Nemotron 3.5 Content Safety (free) — NVIDIA added to the leaderboard
Nemotron 3 Ultra (free) — NVIDIA added to the leaderboard
Nemotron 3 Ultra — NVIDIA added to the leaderboard

June 3, 2026

Qwen3.7 Plus — Qwen added to the leaderboard

June 1, 2026

MiniMax M2.7 — scored 57% on Terminal-Bench

May 31, 2026

MiniMax M3 — MiniMax added to the leaderboard

May 28, 2026

Step 3.7 Flash — StepFun added to the leaderboard
GPT-5.5 — scored 78.2% on Terminal-Bench
GPT-5.5 — scored 41.4% on Humanity's Last Exam
GPT-5.5 Pro — scored 43.1% on Humanity's Last Exam
Claude Opus 4.8 — scored 88.6% on SWE-Bench Verified
Claude Opus 4.8 — scored 74.6% on Terminal-Bench
Claude Opus 4.8 — scored 93.6% on GPQA Diamond
Claude Opus 4.8 — scored 49.8% on Humanity's Last Exam
Claude Opus 4.7 — scored 87.6% on SWE-Bench Verified
Claude Opus 4.7 — scored 66.1% on Terminal-Bench
Claude Opus 4.7 — scored 94.2% on GPQA Diamond
Claude Opus 4.7 — scored 46.9% on Humanity's Last Exam
Gemini 3.1 Pro Preview — scored 44.4% on Humanity's Last Exam

May 27, 2026

Claude Opus 4.8 (Fast) — Anthropic added to the leaderboard
Claude Opus 4.8 — Anthropic added to the leaderboard

May 21, 2026

Qwen3.7 Max — Qwen added to the leaderboard

May 20, 2026

Grok Build 0.1 — xAI added to the leaderboard

May 19, 2026

Gemini 3.5 Flash — Google added to the leaderboard

May 12, 2026

Claude Opus 4.7 (Fast) — Anthropic added to the leaderboard
Perceptron Mk1 — Perceptron added to the leaderboard

May 8, 2026

Ring-2.6-1T — inclusionAI added to the leaderboard

May 7, 2026

Gemini 3.1 Flash Lite — Google added to the leaderboard

May 5, 2026

GPT Chat Latest — OpenAI added to the leaderboard

April 30, 2026

Grok 4.3 — xAI added to the leaderboard
Granite 4.1 8B — IBM added to the leaderboard
Mistral Medium 3.5 — Mistral added to the leaderboard

April 28, 2026

Nemotron 3 Nano Omni (free) — NVIDIA added to the leaderboard
Laguna XS.2 (free) — Poolside added to the leaderboard
Laguna XS.2 — Poolside added to the leaderboard
Laguna M.1 (free) — Poolside added to the leaderboard
Laguna M.1 — Poolside added to the leaderboard

April 27, 2026

Anthropic Claude Haiku Latest — Anthropic added to the leaderboard
OpenAI GPT Mini Latest — OpenAI added to the leaderboard
Google Gemini Pro Latest — Google added to the leaderboard
MoonshotAI Kimi Latest — MoonshotAI added to the leaderboard
Google Gemini Flash Latest — Google added to the leaderboard
Anthropic Claude Sonnet Latest — Anthropic added to the leaderboard
OpenAI GPT Latest — OpenAI added to the leaderboard
Qwen3.5 Plus 2026-04-20 — Qwen added to the leaderboard
Qwen3.6 Flash — Qwen added to the leaderboard
Qwen3.6 35B A3B — Qwen added to the leaderboard
Qwen3.6 Max Preview — Qwen added to the leaderboard
Qwen3.6 27B — Qwen added to the leaderboard

April 24, 2026

GPT-5.5 Pro — OpenAI added to the leaderboard
GPT-5.5 — OpenAI added to the leaderboard
DeepSeek V4 Pro — DeepSeek added to the leaderboard
DeepSeek V4 Flash — DeepSeek added to the leaderboard
DeepSeek V4 Pro — scored 90.1% on GPQA Diamond
DeepSeek V4 Pro — scored 37.7% on Humanity's Last Exam
DeepSeek V4 Pro — scored 87.5% on MMLU-Pro
DeepSeek V4 Pro — scored 80.6% on SWE-Bench Verified