Sakana AI's higher-performance Fugu model — not a monolithic LLM but a learned multi-agent orchestration system that routes requests across multiple models and tools. Multimodal (text+image input), 1M context, $5/$30 per MTok on OpenRouter.
New on HuggingFace: a LiquidAI model with 13 likes.
Brand-new flagship from Zhipu/Z.AI — released on June 20, 2026. The GLM-5 series ships monthly (Apr→May→Jun). Extremely cheap on OpenRouter ($0.0012/$0.0041 per MTok). 1M-token context.
Google's Gemini-3-Pro-Image model ('Nano Banana Pro') — available on OpenRouter since June 18, 2026, $2.00/MTok (input).
Microsoft's specialized code-search model for coding agents — powers the 'Explore' subagent in SWE-FastContext. Not a general-purpose LLM, but relevant for agentic setups.
Google's latest Gemini Flash generation with native image input/output. Available on OpenRouter ($0.50/$3 per MTok). Gemini 3.x is a standalone series alongside Gemini 2.5.
Cohere's code specialist — 30B, open weights, Apache 2.0, 256K context. Free on OpenRouter. Surprisingly strong for its size; Cohere's first genuinely strong open-weights coding model.
Coding-focused offshoot of the K2 series — ~30% fewer thinking tokens than K2.6. Competitive with GPT-5.5 and Claude Opus 4.8 on coding benchmarks. 256K context, MLA attention, 1T-parameter MoE.
Anthropic's new top flagship — replaces the Opus tier as the strongest generally available model. Arena Elo 1508 (rank 1). 1M context, Adaptive Thinking always on, $10/$50 per MTok. Fable 5 + Mythos 5 (invite-only) launched together on June 9, 2026.
NVIDIA's powerful open-weights MoE model: hybrid LatentMoE + MTP layers, 1M context, 550B total parameters with 55B active. Free on OpenRouter. Focus: complex multi-agent workflows, code, math, science.
397B model from the Shanghai Innovation Institute — open source, Apache 2.0, free on OpenRouter. Agentic-focused, with papers on a 'Unified Ecosystem for Large-Scale Environment Construction'. 7.87K HF likes despite barely any Western coverage.
Qwen's latest Plus tier — 1M context, $0.32/$1.28 per MTok. Qwen3.7 Max (stronger) and Plus (more efficient) form the top of the Qwen3.7 line.
MiniMax is barely known in the West, but its M3 model on OpenRouter ($0.30/$1.20 per MTok) is one of the cheapest 1M-context multimodal options. Successor to Text-01 (4M context).
DeepSeek's heavyweight: 1.6T total, 49B active, 1M context, FP4/FP8 mixed precision. Codeforces 3206 Elo beats GPT-5.4. MIT license, fully open weights. $0.435/$0.87 per MTok on OpenRouter.
Lean V4 variant: 158B MoE, 1M context, MIT. Extremely cheap on OpenRouter ($0.09/$0.18 per MTok). 2.48M HF downloads in just days signal massive demand.
Current Opus-tier model, the top of the lineup until Claude Fable 5 launched. Arena Elo ~1490. 1M context, Adaptive Thinking, knowledge through Jan 2026. $5/$25 per MTok direct, similar on OpenRouter.
201B multimodal open-weights model from StepFun — underrated in the West. Image+text input, strong reasoning. Successor to Step 3.5 Flash (199B). $0.20/$1.15 per MTok on OpenRouter.
Strongest Qwen3.7 variant: 1M context, $1.25/$3.75 per MTok. More capable than Plus, but pricier. The Qwen3.7 line shipped late May 2026 as the successor to Qwen3.6.
xAI's coding-focused model — 256K context, $1/$2 per MTok. 'Build' implies software development as the main use case. A separate coding line alongside the general-purpose Grok 4.3.
Google's current Flash generation: 1M context, all modalities, $1.50/$9 per MTok on OpenRouter. Gemini 3.5 Flash is the speed tier of the Gemini 3.x series that replaces Gemini 2.5.
Multimodal predecessor of K2.7-Code — Visual Agentic Intelligence with 1.1T parameters. 2.66M HF downloads. Still relevant for multimodal tasks (images, video) since K2.7 is code-only.
Poolside's strong coding-agent model: 226B MoE, Apache 2.0, SWE-bench Pro 49.2%. Free on OpenRouter! Specialized for agentic long-horizon coding tasks. Poolside is a little-known US AI company.
Poolside's lean coding model: 33B MoE, 3B active, for local deployments. Free on OpenRouter. Sliding-window attention for very fast inference. 231K downloads on HF.
xAI's current frontier model on OpenRouter ($1.25/$2.50 per MTok). Grok 4.3 positions itself as a strong all-rounder. Also accessible via Grok.com / X Premium.
Mistral ironically calls this 119B model 'Small'. Apache 2.0, with instruction-following, reasoning and coding in one. Mistral positions itself as the strongest European open-weights challenger.
IBM's Granite 4.x generation: 8B, Apache 2.0, $0.05/$0.10 per MTok (one of the cheapest). Enterprise-focused, strong on structured tasks. IBM consistently ships Granite models with open weights.
Mistral Medium 3.5 is Mistral's first flagship to handle instruction-following, reasoning and coding in a unified way. 418K HF downloads, EAGLE-acceleration variant available. $1.50/$7.50 per MTok on OpenRouter.
Cohere's first multimodal open-weights flagship: 218B MoE, 25B active, 128K context, 48 languages, Apache 2.0. Reasoning tokens, tool use with JSON schema. Not on OpenRouter yet.
Moonshot's 'Visual Agentic Intelligence': 1.81M HF downloads. Basis for K2.6 and K2.7. One of the first large models to offer genuine multimodal agentic capabilities at 1T parameters.
NVIDIA's small multimodal reasoning model: 30B MoE, only 3B active, text+image+audio input. Free on OpenRouter. Designed for on-device use, 4x faster than its predecessor, with a switchable reasoning ON/OFF mode.
Fast variant of the Qwen3.6 family: 1M context, $0.19/$1.13 per MTok. Qwen3.6 shipped as Flash, 27B, 35B and Max-preview variants — all on April 28, 2026.
Open-weights MoE variant of the Qwen3.6 generation: 35B total, only 3B active. $0.14/$1 per MTok on OpenRouter. Very efficient for local deployment on consumer hardware.
OpenAI's current frontier model: 'A new class of intelligence for coding and professional work.' 1M context, knowledge cutoff Dec 2025, $5/$30 per MTok. GPT-5.5 Pro is the stronger variant ($30/$180 per MTok).
Strongest available GPT-5.5 variant: $30/$180 per MTok, priced on par with o3. Positioned for the most demanding enterprise workloads.
Xiaomi's reasoning model on OpenRouter ($0.435/$0.87 per MTok). MiMo is Xiaomi's first serious LLM push — barely noticed in the West. 1M context. V2.5 (lighter) and V2.5-Pro available.
Tencent's Hunyuan 3 in preview on OpenRouter ($0.063/$0.21 per MTok). One of the cheapest models available. Tencent is a heavyweight that gets little Western attention in AI.
Ant Group's (Alipay's parent) 1-trillion-parameter MoE model — Apache 2.0. Extremely cheap on OpenRouter ($0.075/$0.625 per MTok). 472 likes on HF despite barely any Western coverage.
Xiaomi's standard variant alongside V2.5-Pro: cheaper ($0.14/$0.28 per MTok), 1M context. Xiaomi's entry into the LLM market has gone almost unnoticed in the West, but the model is available on OpenRouter.
Ant Group's fast MoE variant: 107B, extremely cheap on OpenRouter ($0.01/$0.03 per MTok — one of the cheapest options anywhere). Ideal for agentic workflows where cost matters.
OpenAI's second image-generation iteration on the GPT-5.4 base: $8/$15 per MTok (input/output). The current standard route for professional AI image generation via OpenRouter API.
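
The per-MTok prices quoted throughout (input/output, in USD per million tokens) can be turned into a per-request cost with simple arithmetic. The following is a minimal sketch, not an official pricing calculator: the function name `request_cost_usd`, the `PRICES` table layout, and the 200K-input / 10K-output workload are illustrative assumptions, while the prices themselves are copied from the entries above. It ignores anything the entries do not mention, such as caching discounts or tiered pricing.

```python
def request_cost_usd(input_price_per_mtok, output_price_per_mtok,
                     input_tokens, output_tokens):
    """Return the USD cost of one request given per-million-token prices."""
    mtok = 1_000_000
    input_cost = (input_tokens / mtok) * input_price_per_mtok
    output_cost = (output_tokens / mtok) * output_price_per_mtok
    return input_cost + output_cost


# (input, output) prices in USD per MTok, copied from the entries above.
PRICES = {
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.5 Pro": (30.00, 180.00),
    "Mistral Medium 3.5": (1.50, 7.50),
    "Gemini 3.5 Flash": (1.50, 9.00),
}

if __name__ == "__main__":
    # Hypothetical workload: 200K input tokens and 10K output tokens.
    for name, (price_in, price_out) in PRICES.items():
        cost = request_cost_usd(price_in, price_out, 200_000, 10_000)
        print(f"{name}: ${cost:.2f}")
```

Because output tokens are priced several times higher than input tokens for most entries, the split between input and output volume matters as much as the headline price when comparing models.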