Specialty AI Model Catalogs

Catalogs

Choose a model family

Each catalog focuses on one model lineage. Models that already ship in the default catalog are included here too, so you can see where the defaults sit within their full family.

🟢

Gemma Family

Google · Apache 2.0 / Gemma ToU

Model size: 0.8–19 GB

The complete Gemma lineage: Gemma 3 (1B–27B, vision from 4B up), Gemma 3n (E2B/E4B text-only), and Gemma 4 (E2B/E4B/12B/26B/31B, all vision-capable). Includes the Gemma 4 26B-A4B MoE and 31B dense models for high-RAM desktop systems. On systems whose only GPU is Intel integrated graphics, Gemma 4 and Gemma 3n models automatically run on the CPU with a bounded working context for stability and predictable response times — dedicated NVIDIA, AMD, and Intel Arc GPUs are unaffected. Requires NotesXML Beta 5.141.84 or later.

Gemma 3 · Gemma 3n · Gemma 4 · 3 Desktop-only · Apache 2.0 & Gemma ToU
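The Intel integrated-graphics rule in the Gemma description can be read as a simple predicate. The sketch below is only an illustration of that rule as stated, not NotesXML's actual detection logic: the function name `uses_cpu_fallback` and the GPU labels such as `"intel-integrated"` are hypothetical, and the size of the bounded working context is not modeled because the text does not give it.

```python
# Families the catalog text says fall back to CPU on Intel integrated graphics.
CPU_FALLBACK_FAMILIES = {"Gemma 4", "Gemma 3n"}


def uses_cpu_fallback(family: str, gpu_kinds: list[str]) -> bool:
    """Return True when the stated rule would route the model to the CPU.

    gpu_kinds uses hypothetical labels, e.g. "intel-integrated",
    "intel-arc", "nvidia", "amd". The rule applies only when every
    detected GPU is Intel integrated graphics.
    """
    only_intel_integrated = len(gpu_kinds) > 0 and all(
        kind == "intel-integrated" for kind in gpu_kinds
    )
    return family in CPU_FALLBACK_FAMILIES and only_intel_integrated


print(uses_cpu_fallback("Gemma 4", ["intel-integrated"]))            # True
print(uses_cpu_fallback("Gemma 3", ["intel-integrated"]))            # False
print(uses_cpu_fallback("Gemma 4", ["intel-integrated", "nvidia"]))  # False
```

Systems with no GPU at all are outside what the description covers, so the sketch deliberately returns False for an empty list rather than guessing.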

🔶

Mistral Family

Mistral AI · Apache 2.0

Model size: 2.1–80 GB

The broadest family catalog: Ministral 3/8/14B, Mistral 7B, Nemo 12B, Pixtral 12B (vision), Codestral 25.01, Mistral Small 3.1 & 3.2, Devstral Small, Mixtral 8x7B & 8x22B, Magistral Small 24B, and the new Mistral Small 4 119B MoE.

Ministral · Mistral Small · Mixtral MoE · Pixtral · 8 Desktop-only

🦙

Llama Family

Meta · Llama Community License

Model size: 0.8–243 GB

From Llama 3.2 1B through Llama 4 Maverick 402B MoE: Llama 3.1 (8B/70B/405B), Llama 3.2 (1B/3B/11B/90B vision), Llama 3.3 70B, Llama 4 Scout 109B-16E, and Llama 4 Maverick 402B-128E. Warning: the 400B+ entries require 250+ GB RAM.

Llama 3.1 · Llama 3.2 · Llama 3.3 · Llama 4 · 6 Desktop-only

🔷

Phi Family

Microsoft · MIT License

Model size: 2.2–24 GB

Microsoft's Phi line: Phi-3.5 Mini & MoE 42B, Phi-4 14B, Phi-4 Mini, Phi-4 Multimodal 5.6B (vision), Phi-4 Reasoning, Phi-4 Reasoning Plus, Phi-4 Mini Reasoning, and Phi-4 Reasoning Vision 15B (text backbone only). All MIT licensed. Note: all Phi-4 models use hybrid Mamba1/SWA — flash-attn is disabled.

Phi-3.5 · Phi-4 · Reasoning · 5 Desktop-only · MIT

🪨

Granite Family

IBM · Apache 2.0

Model size: 0.2–20 GB

IBM's enterprise-grade Granite stack: 3.0 (1B-A400M MoE, 2B, 8B), 3.2 8B with thinking, 4.0 350M (ultra-compact dense), 4.0 H-Small 32B MoE, and the full 4.1 generation (3B, 8B, 30B). All Apache 2.0. Every model here is text-only because GGUF vision support is not yet available.

Granite 3.0 · Granite 3.2 · Granite 4.0 · Granite 4.1 · Text-only · Apache 2.0

⚪

GPT-OSS Family

OpenAI · Apache 2.0

Model size: 14–63 GB

OpenAI's first open-weight language models since GPT-2 (Aug 2025): GPT-OSS 20B (21B total / 3.6B active MoE) and GPT-OSS 120B (117B total / 5.1B active MoE). Both are reasoning models that use the 'harmony' response format. Both are desktop-only: the 20B needs at least 16 GB of RAM, and the 120B needs 80 GB or more.

GPT-OSS 20B · GPT-OSS 120B · Reasoning · MoE · Desktop-only · Apache 2.0
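As a quick way to apply the RAM minimums quoted in the GPT-OSS description, the following sketch filters the two models by installed memory. The thresholds come from the text above. The names `MIN_RAM_GB` and `runnable_models` are hypothetical, the "80+" figure is treated as a floor of 80 GB, and how NotesXML itself checks memory is not something this page describes.

```python
# Minimum RAM in GB, copied from the GPT-OSS catalog description.
MIN_RAM_GB = {
    "GPT-OSS 20B": 16,
    "GPT-OSS 120B": 80,
}


def runnable_models(installed_ram_gb: float) -> list[str]:
    """List the GPT-OSS models whose stated minimum fits in installed RAM."""
    return [
        name
        for name, required_gb in MIN_RAM_GB.items()
        if installed_ram_gb >= required_gb
    ]


print(runnable_models(8))   # []
print(runnable_models(32))  # ['GPT-OSS 20B']
print(runnable_models(96))  # ['GPT-OSS 20B', 'GPT-OSS 120B']
```

Meeting the minimum only says the model should load; it says nothing about speed or about memory already used by other applications.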

Methodology

Why the default catalog has only 10 models

The default NotesXML catalog is built by Pareto-frontier curation: across the trade-off between inference speed (tokens per second) and benchmark quality (AvgScore), only models that no other model dominates ship by default. If model A is both faster and higher quality than model B, model B is removed, since no user would choose it.

The specialty catalogs here add every model that was evaluated but fell off the Pareto frontier. These models are perfectly functional; they just aren't the optimal choice when faster or higher-quality alternatives exist at similar sizes. You may still prefer them for architecture familiarity, licensing requirements, or specific use cases that the benchmarks don't capture.
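The curation rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used to build the catalogs: the `Model` class, the `split_catalog` function, and all the numbers are made up for the example, dominance is taken in its usual sense (at least as good on both axes and strictly better on one), and the "similar sizes" consideration is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tokens_per_second: float  # inference speed, higher is better
    avg_score: float          # benchmark quality, higher is better


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = (
        a.tokens_per_second >= b.tokens_per_second
        and a.avg_score >= b.avg_score
    )
    strictly_better = (
        a.tokens_per_second > b.tokens_per_second
        or a.avg_score > b.avg_score
    )
    return no_worse and strictly_better


def split_catalog(models: list[Model]) -> tuple[list[Model], list[Model]]:
    """Partition models into (undominated, dominated) sets."""
    default, specialty = [], []
    for m in models:
        if any(dominates(other, m) for other in models):
            specialty.append(m)
        else:
            default.append(m)
    return default, specialty


# Invented figures for illustration only, not real benchmark results.
candidates = [
    Model("fast-small", 120.0, 40.0),
    Model("balanced", 60.0, 60.0),
    Model("slow-large", 15.0, 80.0),
    Model("dominated", 50.0, 55.0),  # slower and lower-scoring than "balanced"
]

default, specialty = split_catalog(candidates)
print([m.name for m in default])    # ['fast-small', 'balanced', 'slow-large']
print([m.name for m in specialty])  # ['dominated']
```

The pairwise comparison is quadratic in the number of models, which is fine for a catalog of a few dozen entries; a sort-based sweep would be the usual choice for much larger sets.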

Benchmark data and the full methodology are published in the AI Benchmarks article.
