v0.21.0
Hermes Agent
`ollama launch hermes`
Hermes learns with you, automatically creating skills to better serve your workflows. Great for research and engineering tasks.
<img width="1329" height="946" alt="image" src="https://github.com/user-attachments/assets/771d3383-95ed-4652-81e5-cf89514d25cc" />

What's Changed
- Gemma 4 on MLX. Added support for running Gemma 4 via MLX on Apple Silicon, including a text-only MLX runtime for the model. The MLX backend also picked up mixed-precision quantization, better capability detection, and a batch of new op wrappers (Conv2d, Pad, activations, trig, masked SDPA, and RoPE-with-freqs).
- Hermes and GitHub Copilot CLI in `ollama launch`. Added both integrations, which can now be configured in one command alongside the rest of the supported coding agents.
- OpenCode moved to inline config. `ollama launch opencode` now writes its config inline rather than to a separate file, matching how other integrations are handled.
- `ollama launch` no longer rewrites config when nothing changed. Pressing → on a configured multi-model integration, or passing `--model` with the current primary, used to trigger a confirmation prompt and rewrite both the editor's config file and `config.json`. Now it's a no-op when the resolved model list matches what's already saved.
- Fixed `ollama launch openclaw --yes` so it correctly skips the channels configuration step, so non-interactive setups complete cleanly.
- Restored the Gemma 4 nothink renderer with the e2b-style prompt.
- Fixed the Gemma 4 compiler error that was breaking Metal builds.
- Fixed macOS cross-compiles so they no longer trigger `generate`, which was breaking cmake builds on some Xcode versions.
- Quieted cgo builds by suppressing deprecated warnings during `go build`.
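
The no-op behavior for `ollama launch` noted in the list above comes down to comparing the resolved model list with the one already saved. Below is a minimal sketch of that idea in Go. It is not Ollama's actual implementation: the helper name `needsRewrite`, the placeholder model names, and the assumption that order matters (with the first entry treated as the primary model) are all illustrative.

```go
package main

import (
	"fmt"
	"slices"
)

// needsRewrite is a hypothetical helper, not part of Ollama's code base.
// It reports whether the resolved model list differs from the saved one.
// Order is treated as significant, on the assumption that the first
// entry is the primary model.
func needsRewrite(saved, resolved []string) bool {
	return !slices.Equal(saved, resolved)
}

func main() {
	// Placeholder model names, used only for illustration.
	saved := []string{"model-a", "model-b"}

	// Same list in the same order: nothing to prompt for or rewrite.
	fmt.Println(needsRewrite(saved, []string{"model-a", "model-b"}))

	// Different primary model: the saved config would need updating.
	fmt.Println(needsRewrite(saved, []string{"model-b", "model-a"}))
}
```

Under these assumptions, a caller would skip both the confirmation prompt and the config writes whenever `needsRewrite` returns false.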
Full Changelog: https://github.com/ollama/ollama/compare/v0.20.7...v0.21.0


