What's Changed
- ggml: force flash attention off for grok by @rick-github in https://github.com/ollama/ollama/pull/15050
- mlx: fix KV cache snapshot memory leak by @jessegross in https://github.com/ollama/ollama/pull/15065
- mlxrunner: schedule periodic snapshots during prefill by @jessegross in https://github.com/ollama/ollama/pull/15058
- doc: update vscode doc by @hoyyeva in https://github.com/ollama/ollama/pull/15064
Full Changelog: https://github.com/ollama/ollama/compare/v0.18.3...v0.18.4-rc0

