The default NotesXML catalog ships 10 Pareto-optimised models. These extended family catalogs let you explore the full breadth of each model lineage — from sub-1 GB ultra-lights to 400B datacenter giants.
⚠️ Specialty catalogs include models that are significantly larger than the defaults. Check the RAM and storage requirements for each model before downloading it inside the app. Large models (20 GB+) are intended for desktop hardware with 32 GB+ RAM.
Each catalog focuses on one model lineage. Models that already ship in the default catalog are included here too — the full family gives you context about where the defaults sit within their lineage.
Google · Apache 2.0 / Gemma ToU
The complete Gemma lineage: Gemma 3 (1B–27B vision), Gemma 3n (E2B/E4B text-only), and Gemma 4 (E2B/E4B/26B/31B vision). Includes the Gemma 4 26B-A4B MoE and 31B dense models for high-RAM desktop systems.
Mistral AI · Apache 2.0
The broadest family catalog: Ministral 3/8/14B, Mistral 7B, Nemo 12B, Pixtral 12B (vision), Codestral 25.01, Mistral Small 3.1 & 3.2, Devstral Small, Mixtral 8x7B & 8x22B, Magistral Small 24B, and the new Mistral Small 4 119B MoE.
Meta · Llama Community License
From Llama 3.2 1B through Llama 4 Maverick 402B MoE: Llama 3.1 (8B/70B/405B), Llama 3.2 (1B/3B/11B/90B vision), Llama 3.3 70B, Llama 4 Scout 109B-16E, and Llama 4 Maverick 402B-128E. Warning: the 400B+ entries require 250+ GB RAM.
Microsoft · MIT License
Microsoft's Phi line: Phi-3.5 Mini & MoE 42B, Phi-4 14B, Phi-4 Mini, Phi-4 Multimodal 5.6B (vision), Phi-4 Reasoning, Phi-4 Reasoning Plus, Phi-4 Mini Reasoning, and Phi-4 Reasoning Vision 15B (text backbone only). All MIT licensed. Note: all Phi-4 models use hybrid Mamba1/SWA — flash-attn is disabled.
IBM · Apache 2.0
IBM's enterprise-grade Granite stack: 3.0 (1B-A400M MoE, 2B, 8B), 3.2 8B with thinking, 4.0 350M (ultra-compact dense), 4.0 H-Small 32B MoE, and the full 4.1 generation (3B, 8B, 30B). All Apache 2.0. Text-only across the board — GGUF vision support not yet available.
OpenAI · Apache 2.0
OpenAI's first open-weight releases (Aug 2025): GPT-OSS 20B (21B total / 3.6B active MoE) and GPT-OSS 120B (117B total / 5.1B active MoE). Both are reasoning models using the 'harmony' response format. Desktop-only. Minimum 16 GB RAM for 20B; 80+ GB RAM for 120B.
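The RAM guidance above can be turned into a quick pre-download check. The following is a minimal sketch, not part of NotesXML: the `CatalogEntry` class and the `runnable` helper are hypothetical names, the figures are copied from the GPT-OSS description above, and "80+ GB" is treated as a minimum of 80 GB, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical record for one catalog model (not a NotesXML type)."""
    name: str
    total_params_b: float   # total parameters, in billions
    active_params_b: float  # parameters active per token (MoE), in billions
    min_ram_gb: int         # minimum RAM quoted in the catalog text

    @property
    def active_fraction(self) -> float:
        # Share of the weights that is used for each generated token.
        return self.active_params_b / self.total_params_b


# Figures taken from the GPT-OSS description; "80+ GB" is assumed to mean >= 80.
GPT_OSS = [
    CatalogEntry("GPT-OSS 20B", 21.0, 3.6, 16),
    CatalogEntry("GPT-OSS 120B", 117.0, 5.1, 80),
]


def runnable(entries, installed_ram_gb):
    """Return the names of entries whose minimum RAM fits the machine."""
    return [e.name for e in entries if installed_ram_gb >= e.min_ram_gb]


print(runnable(GPT_OSS, 32))  # ['GPT-OSS 20B']
```

The installed RAM is passed in as a plain number so the sketch stays dependency-free. The in-app requirements for each model remain the authoritative figures.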
The default NotesXML catalog is constructed using Pareto-frontier curation: given the trade-off between inference speed (tokens per second) and benchmark quality (AvgScore), only models that no other evaluated model dominates, that is, the models on the frontier, ship by default. If model A is both faster and higher quality than model B, model B is removed, because on those two measures no user would choose it.
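The dominance rule described above can be sketched in a few lines of Python. This is an illustration only, not NotesXML's actual curation code: the `Model` class, the field names `tokens_per_second` and `avg_score`, and the sample figures are all hypothetical. Ties on either axis keep both models, which is an assumption, since the text does not say how ties are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """Hypothetical benchmark record; field names are illustrative."""
    name: str
    tokens_per_second: float  # inference speed
    avg_score: float          # benchmark quality (AvgScore)


def dominates(a: Model, b: Model) -> bool:
    """True if a is both faster and higher quality than b.

    Strict on both axes, following the rule as stated in the text;
    a tie on either axis means neither model dominates (assumption).
    """
    return (a.tokens_per_second > b.tokens_per_second
            and a.avg_score > b.avg_score)


def pareto_frontier(models):
    """Keep every model that no other model dominates, in input order."""
    return [m for m in models
            if not any(dominates(other, m) for other in models)]


# Made-up figures, for illustration only.
candidates = [
    Model("model-a", 40.0, 60.0),
    Model("model-b", 30.0, 55.0),  # slower and lower quality than model-a
    Model("model-c", 20.0, 70.0),  # slowest, but highest quality
    Model("model-d", 60.0, 40.0),  # fastest, but lowest quality
]

frontier = pareto_frontier(candidates)
print([m.name for m in frontier])  # ['model-a', 'model-c', 'model-d']
```

Here `model-b` is the only entry removed: `model-a` beats it on both axes, while `model-c` and `model-d` each remain the best available option at one end of the speed/quality trade-off.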
In addition to the defaults, the specialty catalogs here include every model that was evaluated but fell off the Pareto frontier. These models are perfectly functional; they just aren't the optimal choice when faster or higher-quality alternatives exist at similar sizes. You may prefer them for architecture familiarity, licensing requirements, or specific use cases not captured in the benchmarks.
Benchmark data and the full methodology are published in the AI Benchmarks article.
Download NotesXML, import a specialty catalog, and run any of these models entirely on your own hardware.