v0.19.0

Ollama is now powered by MLX on Apple Silicon in preview

Ollama on Apple silicon is now built on top of MLX, Apple’s machine learning framework, to take advantage of the unified memory architecture of Apple silicon.
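Before trying the MLX preview, it can help to confirm that the local server is actually running this release. The following is a minimal sketch, not an official tool: it assumes the server listens on the default address `http://localhost:11434` and that `GET /api/version` returns a JSON object with a `version` field. The helper names (`parse_version`, `is_at_least`, `server_version`) are hypothetical and chosen for illustration.

```python
import json
import urllib.request

# Assumption: default local Ollama address; change it if your server binds elsewhere.
OLLAMA_URL = "http://localhost:11434"
MIN_VERSION = (0, 19, 0)


def parse_version(text):
    """Turn a string such as '0.19.0' or 'v0.19.0-rc1' into a tuple like (0, 19, 0)."""
    # Drop a leading 'v' and any pre-release or build suffix.
    core = text.strip().lstrip("v").split("-", 1)[0].split("+", 1)[0]
    parts = [int(p) for p in core.split(".")[:3]]
    # Pad short versions such as '0.20' to three components.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_at_least(version_text, minimum=MIN_VERSION):
    """Return True if version_text is the minimum release or newer."""
    return parse_version(version_text) >= minimum


def server_version(base_url=OLLAMA_URL, timeout=5):
    """Ask the running server for its version string."""
    with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
        return json.load(resp)["version"]


if __name__ == "__main__":
    version = server_version()
    status = "meets the minimum" if is_at_least(version) else "is older than 0.19.0"
    print(f"Ollama {version} {status}")
```

Note that the comparison ignores pre-release suffixes, so a release candidate of 0.19.0 is treated the same as the final release.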

Ollama's app no longer incorrectly shows "model is out of date"
ollama launch pi now includes a web search plugin that uses Ollama's web search
Improved KV cache hit rate when using the Anthropic-compatible API
Fixed a tool call parsing issue with Qwen3.5 where tool calls would be output inside the thinking content
The MLX runner now creates periodic snapshots during prompt processing
Fixed a KV cache snapshot memory leak in the MLX runner
Fixed an issue where flash attention would be incorrectly enabled for grok models
Fixed qwen3-next:80b not loading in Ollama

Full Changelog: https://github.com/ollama/ollama/compare/v0.18.3...v0.19.0