AI Coding Models: The Full Reference Table (July 2026)

I keep researching this for my own work, so I turned it into a reference. One big table, everything in it: prices, benchmarks, licenses, where to access each model. Scan it, pick what you need, go build.

Quick mental model before the table: Agent = Model + Harness. The model is the intelligence; the harness (Claude Code, Cursor, Cline, Aider...) turns it into something that navigates your repo and fixes its own mistakes. Pick both.

The table

Model	Creator	License	Price in/out (USD per 1M tokens)	SWE-Bench (or other signal)	Direct access	Also on
Claude Fable 5	Anthropic	Closed	$10 / $50	80.3% Pro	platform.claude.com	Claude Code, claude.ai
Claude Opus 4.8	Anthropic	Closed	$5 / $25	69.2% Pro, 88.6% Verified	platform.claude.com	Claude Code, Cursor, OpenRouter, Bedrock, Vertex
Claude Sonnet 5	Anthropic	Closed	~$3 / $15	new, near Opus 4.8	platform.claude.com	Claude Code, Cursor
Claude Sonnet 4.6	Anthropic	Closed	$3 / $15	strong	platform.claude.com	Cursor, Windsurf, Cline, OpenRouter
Claude Haiku 4.5	Anthropic	Closed	$1 / $5	73.3% Verified	platform.claude.com	OpenRouter
GPT-5.5	OpenAI	Closed	$5 / $30	58.6% Pro, 88.7% Verified	platform.openai.com	Codex CLI, Copilot, Cursor, Azure
GPT-5.3-Codex	OpenAI	Closed	$1.75 / $14	coding-tuned	platform.openai.com	Codex CLI
Gemini 3.1 Pro	Google	Closed	$2 / $12 ($4 / $18 above 200K)	54.2% Pro, 80.6% Verified	aistudio.google.com	Gemini CLI, Vertex, Cursor, OpenRouter
Grok 4.x	xAI	Closed	~$2.60 / $7.80	Tier B	x.ai	Cursor, OpenRouter
DeepSeek V4-Pro	DeepSeek	MIT	$1.74 / $3.48	80.6% Verified	platform.deepseek.com	OpenRouter, Together, Morph, HuggingFace
DeepSeek V4-Flash	DeepSeek	MIT	$0.14 / $0.28	79% Verified	platform.deepseek.com	OpenRouter (:free tier), Morph, Ollama
DeepSeek V3.2	DeepSeek	MIT	~$0.23 / ~$1	best value classic	platform.deepseek.com	OpenRouter, Ollama
GLM 5.2	Z.ai	Open	$1.40 / $4.40 (~$0.45 / $3.31 avg on OR)	62.1% Pro (vendor)	z.ai	OpenRouter, HuggingFace, GLM Coding Plan
GLM-4.7-Flash	Z.ai	Open	free	decent	z.ai	OpenRouter, Ollama
Kimi K2.6	Moonshot	Open	$0.95 / $4.00	80.2% Verified, 58.6% Pro	platform.moonshot.ai	OpenRouter, Together, Groq, HuggingFace
Kimi K2.7 Code	Moonshot	Apache 2.0	~K2.6 rates, 30% fewer thinking tokens	coding-tuned	platform.moonshot.ai	OpenRouter, HuggingFace
MiniMax M3	MiniMax	Open	$0.60 / $2.40 (~$0.10 / $1.21 avg on OR)	80.5% Verified	minimax.io	OpenRouter, Atlas Cloud
MiniMax M2.7	MiniMax	Open	$0.30 / $1.20	56.2% Pro	minimax.io	Together, Atlas Cloud
Qwen3.7-Max	Alibaba	Closed API	via Alibaba Cloud	80%+ Verified	alibabacloud.com	OpenRouter
Qwen 3.6-27B	Alibaba	Apache 2.0	free local (22GB VRAM)	77.2% Verified	Ollama, HuggingFace	Together, OpenRouter
Qwen 3.6 Plus	Alibaba	Open	$0.50 / $3.00	61.6% Terminal-Bench	Alibaba Cloud	Together, OpenRouter
Qwen3 Coder	Alibaba	Apache 2.0	free on OpenRouter	best free coding model	OpenRouter :free	Ollama
Qwen 2.5 Coder 32B	Alibaba	Apache 2.0	free local (18GB+ VRAM)	best local classic	Ollama	LM Studio, Continue.dev
Nemotron 3 Ultra	NVIDIA	OpenMDW	~$0.42 / $2.61 (OR avg)	#2 open on AA index	NVIDIA NIM	OpenRouter (:free route), HuggingFace
North Mini Code	Cohere	Apache 2.0	free local	33.4 AA Coding Index	HuggingFace	Ollama, vLLM
Codestral 22B	Mistral	Open	free local (12GB VRAM)	best autocomplete local	Ollama	Mistral API, OpenRouter
Devstral Small 24B	Mistral	Open	free local (16GB VRAM)	near-frontier local	Ollama	Mistral API
Llama 4 Scout	Meta	Open	free local, :free on OR	solid, 10M context	HuggingFace	Ollama, Groq, Together
Poolside Laguna M.1	Poolside	Closed	free tier	#1 on Kilo usage	Kilo Code	Poolside platform
openPangu 2.0	Huawei	Open	free local	competitive	HuggingFace	vLLM self-host

How to read it:

Prices are first-party rates as of late June 2026. OR = OpenRouter weighted average, noted where it differs a lot from direct.
OpenRouter adds a ~5.5% fee on credits ($0.80 minimum) and a 5% BYOK fee above 1M requests/month. The per-token rates themselves match provider list prices.
Benchmarks mix vendor-reported and independent numbers. Compare only scores from the same benchmark (Pro with Pro, Verified with Verified), and test on your own code before committing.
Fable 5 caveat: a US export-control directive suspended access on June 12, expected back for US users around July 1. If you are outside the US (like me), verify before building on it.
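
To make the price column concrete, here is a minimal sketch of how a per-request cost falls out of the in/out rates, plus the OpenRouter credit fee from the note above. The function names and the 200K-in / 40K-out request size are illustrative assumptions. The fee model (a percentage with a $0.80 floor, charged when buying credits) is one reading of that note, not OpenRouter's official formula.

```python
def request_cost(tokens_in: int, tokens_out: int,
                 price_in: float, price_out: float) -> float:
    """USD cost of one request, given prices in USD per 1M tokens."""
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000


def openrouter_credit_fee(purchase_usd: float, rate: float = 0.055,
                          minimum: float = 0.80) -> float:
    """Fee on a credit purchase: a percentage with a floor (assumed model)."""
    return max(purchase_usd * rate, minimum)


# Illustrative request: 200K tokens in, 40K tokens out.
opus = request_cost(200_000, 40_000, 5.00, 25.00)   # Claude Opus 4.8 row
flash = request_cost(200_000, 40_000, 0.14, 0.28)   # DeepSeek V4-Flash row

print(f"Opus 4.8: ${opus:.2f}")      # $2.00
print(f"V4-Flash: ${flash:.4f}")     # $0.0392
print(f"Fee on $10 of credits:  ${openrouter_credit_fee(10):.2f}")   # $0.80 (floor applies)
print(f"Fee on $100 of credits: ${openrouter_credit_fee(100):.2f}")  # $5.50
```

The floor means small top-ups carry a higher effective fee: under this reading, $0.80 on $10 is 8%, not 5.5%.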

The agents (harnesses)

Agent	Type	Models	Link
Claude Code	First-party CLI/app	Claude family	claude.com/claude-code
Codex CLI	First-party CLI	GPT family	openai.com/codex
Gemini CLI	First-party CLI	Gemini	github.com/google-gemini/gemini-cli
Copilot Agent Mode	IDE-native	GPT family	github.com/features/copilot
Cursor	AI IDE	BYOM (Claude, GPT, Gemini...)	cursor.com
Windsurf	AI IDE	Multiple	windsurf.com
Cline	VS Code ext, BYOM	Anything via OpenRouter/Ollama	cline.bot
RooCode	VS Code ext, BYOM	Anything, strong on large multi-file work	roocode.com
Aider	Terminal, BYOM	Anything, git-native	aider.chat
Kilo Code	BYOM, free tiers	Anything, live usage leaderboard	kilo.ai
Continue.dev	VS Code ext, local	Ollama models	continue.dev

Providers cheat sheet

Provider	What it is	When to use
First-party APIs	Direct from each creator	Lowest per-token price when one vendor's models are enough
OpenRouter	315+ models, one key	Multi-model access, budget +5-7% overhead
Together AI / Fireworks	Neutral open-model hosts	Open models with fine-tuning, dedicated deploys
Groq	Fast inference	Speed on open models
Morph	bf16, no quantization	Open-model fidelity for codegen (most hosts quantize to fp8 and lose quality)
Bedrock / Vertex / Azure	Cloud resellers	10-20% more per token, but compliance and one cloud bill
Ollama	Local runner	Free, private, offline
Subscriptions	Claude Pro/Max, Codex Plus, GLM Coding Plan, OpenCode Go ($10/mo)	Can beat API billing depending on your usage profile

Three numbers to remember

The 10x cliff. Five models score between 80.2% and 80.6% on SWE-Bench Verified (DeepSeek V4-Pro, Gemini 3.1 Pro, MiniMax M3, Qwen3.7-Max, Kimi K2.6), with listed output prices from $2.40 to $12 per million tokens. The next 8 points up (GPT-5.5, Opus 4.8) cost $25 to $30. Measured against the cheapest of that group, the last 8 points of quality cost about 10x. Know if your work lives in that gap.
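
The arithmetic behind that multiple can be checked directly from the table's output prices. This is a small sketch; the variable names are arbitrary, and Qwen3.7-Max is left out because the table lists no per-token price for it.

```python
# Output prices (USD per 1M tokens) copied from the table above.
plateau = {
    "DeepSeek V4-Pro": 3.48,
    "Gemini 3.1 Pro": 12.00,
    "MiniMax M3": 2.40,
    "Kimi K2.6": 4.00,
}
frontier = {"Claude Opus 4.8": 25.00, "GPT-5.5": 30.00}

cheapest_plateau = min(plateau.values())

# How many times more each frontier model costs than the cheapest ~80% model.
ratios = {name: price / cheapest_plateau for name, price in frontier.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the cheapest plateau model")
```

This gives roughly 10.4x for Opus 4.8 and 12.5x for GPT-5.5 against MiniMax M3. Against Gemini 3.1 Pro at $12 the multiple is closer to 2x, so the 10x figure describes the cheap end of the group.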

The 21.7 point gap. Claude Fable 5 scores 80.3% on SWE-Bench Pro vs 58.6% for GPT-5.5. Frontier benchmarks usually move in single digits. This one did not.

1/36th. DeepSeek V4-Flash delivers about 82% of Opus 4.8's SWE-Bench Pro score at roughly 1/36th the input price ($0.14 vs $5 per million tokens). It does not replace Opus. It means Opus should stop doing DeepSeek's job.
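
For reference, the price ratio works out as below, using only the list prices from the table. The output-side ratio is not quoted above; it is included here because the same two table rows imply it.

```python
# USD per 1M tokens, from the table rows for each model.
opus_input, opus_output = 5.00, 25.00      # Claude Opus 4.8
flash_input, flash_output = 0.14, 0.28     # DeepSeek V4-Flash

input_ratio = opus_input / flash_input
output_ratio = opus_output / flash_output

print(f"Input price ratio:  {input_ratio:.1f}x")   # 35.7x, i.e. roughly 1/36th
print(f"Output price ratio: {output_ratio:.1f}x")  # 89.3x
```

On output-heavy workloads, which agentic coding often is, the gap is wider than the headline 1/36th suggests.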

My routing shortcut

Cheap and low-risk work: DeepSeek V4-Flash
Default coding workhorse: Kimi K2.6 or Sonnet-class
Hard architecture, repeated failures: frontier (Opus 4.8, GPT-5.5, Fable 5 if you can get it)
Private or offline: Qwen local via Ollama
Always: measure cost per successful task, not cost per token. A cheap model that causes rework is expensive.
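
The shortcut and the cost-per-successful-task rule can be sketched as a few lines of Python. Everything here is an assumption for illustration: the route labels, the two-failure escalation threshold, the independent-retry model, and above all the success rates and rework cost, which are made-up numbers and not measurements.

```python
# Routing table taken from the shortcut above; the keys are hypothetical labels.
ROUTES = {
    "cheap": "DeepSeek V4-Flash",
    "default": "Kimi K2.6",
    "hard": "Claude Opus 4.8",
    "private": "Qwen local via Ollama",
}

MAX_FAILURES = 2  # assumption: escalate after two failed attempts


def route(task_kind: str, failed_attempts: int = 0) -> str:
    """Pick a model, escalating to the frontier tier on repeated failures."""
    # Private work never leaves the local model, even after failures.
    if task_kind != "private" and failed_attempts >= MAX_FAILURES:
        return ROUTES["hard"]
    return ROUTES.get(task_kind, ROUTES["default"])


def cost_per_success(attempt_cost: float, success_rate: float,
                     rework_cost: float = 0.0) -> float:
    """Expected USD per successful task, assuming independent retries.

    Each failed attempt also costs `rework_cost` (review time, cleanup).
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return (attempt_cost + (1 - success_rate) * rework_cost) / success_rate


# Hypothetical numbers, for illustration only.
cheap = cost_per_success(attempt_cost=0.04, success_rate=0.5, rework_cost=5.0)
frontier = cost_per_success(attempt_cost=2.00, success_rate=0.9, rework_cost=5.0)
print(f"cheap model:    ${cheap:.2f} per successful task")     # $5.08
print(f"frontier model: ${frontier:.2f} per successful task")  # $2.78
```

With these invented inputs the model that is 50x cheaper per attempt ends up nearly twice as expensive per finished task, because rework dominates. Plug in your own measured success rates before drawing conclusions.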

This space moves monthly. I will update this table when the next shakeup lands. Building with agents and want to compare notes? I am @itseduvieira pretty much everywhere.