v0.21.1
What's Changed
Kimi CLI
You can now install and run the Kimi CLI through Ollama.
ollama launch kimi --model kimi-k2.6:cloud
Kimi CLI with Kimi K2.6 excels at long-horizon agentic execution tasks through a multi-agent system.
- MLX runner adds logprobs support for compatible models
- Faster MLX sampling with fused top-P and top-K in a single sort pass, plus repeat penalties applied in the sampler
- Improved MLX prompt tokenization by moving tokenization into request handler goroutines
- Better MLX thread safety for array management
- GLM4 MoE Lite performance improvement with a fused sigmoid router head
- Fixed model picker showing stale model after switching chats in the macOS app
- Fixed structured outputs for Gemma 4 when think=false
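
To illustrate the sampling item above, here is a minimal Python sketch of what "fused top-P and top-K in a single sort pass" can mean in general: the logits are sorted once, and both filters reuse that ordering. This is not the MLX runner's code. The function names (fused_top_k_top_p, sample) and the choice to apply top-K before top-P are assumptions made for illustration only.

```python
import math
import random


def fused_top_k_top_p(logits, k, p):
    """Apply top-K then top-P using one descending sort (illustrative only).

    Returns (token_ids, probs): the surviving token ids in descending
    logit order and their renormalized probabilities.
    Assumes a non-empty list of logits.
    """
    # The only sort: token ids ordered by logit, highest first.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)

    # Top-K is a prefix of the sorted order (k <= 0 means "no top-K limit").
    if k > 0:
        order = order[:k]

    # Numerically stable softmax over the top-K survivors.
    max_logit = logits[order[0]]
    weights = [math.exp(logits[i] - max_logit) for i in order]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-P is also a prefix: the shortest one whose cumulative mass reaches p.
    cutoff = len(order)
    cumulative = 0.0
    for n, prob in enumerate(probs):
        cumulative += prob
        if cumulative >= p:
            cutoff = n + 1
            break

    kept_ids = order[:cutoff]
    kept_mass = sum(probs[:cutoff])
    return kept_ids, [prob / kept_mass for prob in probs[:cutoff]]


def sample(logits, k, p, rng=random):
    """Draw one token id from the filtered distribution."""
    ids, probs = fused_top_k_top_p(logits, k, p)
    return rng.choices(ids, weights=probs, k=1)[0]
```

Because both filters reduce to taking a prefix of the same sorted sequence, a second sort is unnecessary. A real sampler would also apply temperature and the repeat penalties mentioned above before this step.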
Full Changelog: https://github.com/ollama/ollama/compare/v0.21.0...v0.21.1
Fetched May 26, 2026
