v0.21.0

Hermes Agent

ollama launch hermes

Hermes learns with you, automatically creating skills to better serve your workflows. Great for research and engineering tasks.

Gemma 4 on MLX. Added support for running Gemma 4 via MLX on Apple Silicon, including a text-only MLX runtime for the model. The MLX backend also picked up mixed-precision quantization, better capability detection, and a batch of new op wrappers (Conv2d, Pad, activations, trig, masked SDPA, and RoPE-with-freqs).
Hermes and GitHub Copilot CLI in ollama launch. Added both integrations, which can now be configured in one command alongside the rest of the supported coding agents.
OpenCode moved to inline config. ollama launch opencode now writes its config inline rather than to a separate file, matching how other integrations are handled.
ollama launch no longer rewrites config when nothing changed. Pressing → on a configured multi-model integration, or passing --model with the current primary, used to trigger a confirmation prompt and rewrite both the editor's config file and config.json. Now it's a no-op when the resolved model list matches what's already saved.
Fixed ollama launch openclaw --yes to correctly skip the channels configuration step, so non-interactive setups complete cleanly.
Restored the Gemma 4 nothink renderer with the e2b-style prompt.
Fixed the Gemma 4 compiler error that was breaking Metal builds.
Fixed macOS cross-compiles so they no longer trigger generate, which was breaking cmake builds on some Xcode versions.
Quieted cgo builds by suppressing deprecation warnings during go build.
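
The no-op behavior described for ollama launch above can be pictured as a comparison between the saved model list and the newly resolved one. The following is only a minimal sketch of that idea, not the actual implementation: the launchConfig type, the needsRewrite function, and the model names are hypothetical, and the comparison is assumed to be order-sensitive because the first entry is treated as the primary model.

```go
package main

import (
	"fmt"
	"slices"
)

// launchConfig is a hypothetical stand-in for the saved integration settings.
// The first entry in Models is assumed to be the primary model.
type launchConfig struct {
	Models []string
}

// needsRewrite reports whether the resolved model list differs from the
// saved one. The comparison is order-sensitive, so changing the primary
// model counts as a change even if the set of models is the same.
func needsRewrite(saved launchConfig, resolved []string) bool {
	return !slices.Equal(saved.Models, resolved)
}

func main() {
	saved := launchConfig{Models: []string{"model-a", "model-b"}}

	// Same list in the same order: nothing to prompt for or write.
	fmt.Println(needsRewrite(saved, []string{"model-a", "model-b"})) // false

	// Different primary model: the config would need to be rewritten.
	fmt.Println(needsRewrite(saved, []string{"model-b", "model-a"})) // true
}
```

In this sketch, a caller would skip both the confirmation prompt and the config writes whenever needsRewrite returns false.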

Full Changelog: https://github.com/ollama/ollama/compare/v0.20.7...v0.21.0