v0.20.1
What's Changed
- bench: add prompt calibration, context size flag, and NumCtx reporting by @dhiltgen in https://github.com/ollama/ollama/pull/15158
- model/parsers: fix gemma4 arg parsing when quoted strings contain " by @drifkin in https://github.com/ollama/ollama/pull/15254
- ggml: skip cublasGemmBatchedEx during graph reservation by @jessegross in https://github.com/ollama/ollama/pull/15301
- gemma4: enable flash attention by @dhiltgen in https://github.com/ollama/ollama/pull/15296
- ggml: fix ROCm build for cublasGemmBatchedEx reserve wrapper by @jessegross in https://github.com/ollama/ollama/pull/15305
- model/parsers: rework gemma4 tool call handling by @drifkin in https://github.com/ollama/ollama/pull/15306
Full Changelog: https://github.com/ollama/ollama/compare/v0.20.0...v0.20.1

