v0.20.4
What's Changed
- mlx: Improve M5 performance with NAX
- gemma4: enable flash attention
Full Changelog: https://github.com/ollama/ollama/compare/v0.20.3...v0.20.4
Fetched May 26, 2026