v0.19.0

<img width="480" alt="image" src="https://github.com/user-attachments/assets/1b5ca980-b9d5-490e-99b9-f0f7b9af2c32" />

Ollama is now powered by MLX on Apple Silicon in preview

Ollama on Apple silicon is now built on top of Apple’s machine learning framework, MLX, to take advantage of its unified memory architecture.

https://github.com/user-attachments/assets/600297b0-3167-46a5-8e3a-fefda3a51b84

Read more: https://ollama.com/blog/mlx

What's Changed

  • Ollama's app no longer incorrectly shows "model is out of date"
  • ollama launch pi now includes a web search plugin that uses Ollama's web search
  • Improved KV cache hit rate when using the Anthropic-compatible API
  • Fixed a tool call parsing issue with Qwen3.5 where tool calls would be output inside the thinking content
  • The MLX runner now creates periodic snapshots during prompt processing
  • Fixed a KV cache snapshot memory leak in the MLX runner
  • Fixed an issue where flash attention would be incorrectly enabled for grok models
  • Fixed qwen3-next:80b not loading in Ollama
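
To make the KV cache and snapshot items above more concrete, here is a minimal sketch of how prefix reuse with periodic snapshots can work in general. It is an illustration under stated assumptions, not Ollama's or MLX's actual implementation: the function names, the fixed snapshot interval, and the use of plain integer lists as tokens are all hypothetical.

```python
# Illustrative only: a toy model of prefix-based KV cache reuse.
# None of these names come from Ollama or MLX; they are assumptions.

def common_prefix_len(cached, prompt):
    """Count the leading tokens shared by the cached sequence and the new prompt."""
    n = 0
    for a, b in zip(cached, prompt):
        if a != b:
            break
        n += 1
    return n


def snapshot_positions(prompt_len, interval):
    """Token offsets at which a snapshot would be taken during prompt processing."""
    return list(range(interval, prompt_len + 1, interval))


def resume_point(cached, prompt, interval):
    """Latest snapshot offset that falls inside the shared prefix (0 if none)."""
    shared = common_prefix_len(cached, prompt)
    usable = [p for p in snapshot_positions(len(cached), interval) if p <= shared]
    return usable[-1] if usable else 0


cached = list(range(10))             # tokens from an earlier request
prompt = list(range(7)) + [99, 98]   # new request that diverges after 7 tokens
reused = resume_point(cached, prompt, interval=3)
hit_rate = reused / len(prompt)      # fraction of the prompt served from cache
```

In this toy model, a request whose prefix changes early (for example, because a header or system prompt is serialized differently between calls) can reuse few or no cached tokens, while more frequent snapshots let a new prompt resume closer to the point where it diverges.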

Full Changelog: https://github.com/ollama/ollama/compare/v0.18.3...v0.19.0

Fetched May 26, 2026