---
name: Ollama
slug: ollama
type: github
source_url: https://github.com/ollama/ollama
organization: Ollama
organization_slug: ollama
total_releases: 104
latest_version: v0.30.10
latest_date: 2026-06-17
last_updated: 2026-06-24
tracking_since: 2025-03-14
canonical: https://releases.sh/ollama/ollama
organization_url: https://releases.sh/ollama
---

## What's Changed

* Command A and North family models now run on Apple Silicon with the MLX engine
* Updated the underlying llama.cpp engine to build 9672
* Fixed build artifacts for MLX

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.30.9...v0.30.10

## What's Changed

* Support for Cohere2Moe architecture
* Fixed LFM2 parser/render for cases where thinking was not emitted
* Fixed issue where `ollama launch claude` and other coding agent or assistant use cases would only output one token
* Ollama will now return an error if a single message is larger than the current context window

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.30.8...v0.30.9-rc1

## What's Changed

* Fix gemma4:12b floating point exception crash
* integrations: hermes windows install by @BruceMacD in https://github.com/ollama/ollama/pull/16487

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.30.4...v0.30.5

## What's Changed

* models: add support for gemma4-12b by @pdevine in https://github.com/ollama/ollama/pull/16457

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.30.2...v0.30.3

## What's Changed

* feat(launch): show and auto-install Cline CLI by @hoyyeva in https://github.com/ollama/ollama/pull/16402
* log template details to aid troubleshooting by @dhiltgen in https://github.com/ollama/ollama/pull/16403
* cmd/launch: add Qwen code integration by @hoyyeva in https://github.com/ollama/ollama/pull/15900
* launch: fix opencode local model limits by @dhiltgen in https://github.com/ollama/ollama/pull/16425
* llm: include cached prompt tokens in llama-server counts by @dhiltgen in
https://github.com/ollama/ollama/pull/16428
* Harden app markdown URL handling by @dhiltgen in https://github.com/ollama/ollama/pull/16380
* discover: allow Radeon 8060S iGPU by default by @dhiltgen in https://github.com/ollama/ollama/pull/16429
* llm: detect llama-server load stalls from output by @dhiltgen in https://github.com/ollama/ollama/pull/16427
* More harden app markdown URL handling by @dhiltgen in https://github.com/ollama/ollama/pull/16436
* llama.cpp version update by @dhiltgen in https://github.com/ollama/ollama/pull/16426
* launch: isolate Codex launch configuration by @ParthSareen in https://github.com/ollama/ollama/pull/16437
* llama: add laguna (poolside) arch via a llama.cpp patch under llama/c… by @dhiltgen in https://github.com/ollama/ollama/pull/16396
* docs: configure hermes desktop app by @BruceMacD in https://github.com/ollama/ollama/pull/16440
* llm: ignore llama-server SSE ping comments by @dhiltgen in https://github.com/ollama/ollama/pull/16443
* fix laguna patch build breakage by @dhiltgen in https://github.com/ollama/ollama/pull/16445

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.30.0...v0.30.2-rc0

## Codex App

Ollama 0.24 includes support for the Codex App, OpenAI's desktop experience for working on Codex threads in parallel with built-in worktree support and git functionality.

```bash
ollama launch codex-app
```

### Built-in browser

Codex can load local servers and sites in its built-in browser, enabling you to directly annotate on the page to request changes.

### Review mode

Review code inside the app, leave comments, and iterate without leaving your workspace.

### Choosing a model

For difficult coding and agentic tasks:

- **kimi-k2.6** (with vision support)
- **glm-5.1**

For local use without an Ollama Cloud subscription:

- **nemotron-3-super**
- **gemma4:31b**
- **qwen3.6**

### Restore anytime

To restore the previous configuration of Codex App, run:

```bash
ollama launch codex-app --restore
```

## What's Changed

* Reworked the MLX sampler for improved generation quality on Apple Silicon

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.23.0...v0.24.0

## What's Changed

* `ollama launch opencode` now supports vision models with image inputs
* Fixed formatting of Claude tool results when using local image paths

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.23.3...v0.23.4

Ollama 0.30 is now available, with improved compatibility and performance using [llama.cpp](https://github.com/ggml-org/llama.cpp). This augments the MLX engine on Apple Silicon, bringing support to a wider range of hardware. This release brings support for a wider range of models, including GGUF-based models from Hugging Face and your own fine-tuned models along with faster performance on NVIDIA hardware.

## Known issues:

* `laguna-xs.2` is not yet supported on Windows/Linux.
* `llama3.2-vision` is not yet supported
* `nomic-embed-text` now converts inputs to lowercase per the model card, whereas prior Ollama versions incorrectly preserved mixed case

## What's Changed

* mlx: refined model push behavior by @dhiltgen in https://github.com/ollama/ollama/pull/15431
* test: integration test hardening by @dhiltgen in https://github.com/ollama/ollama/pull/13532
* app: harden update flows by @dhiltgen in https://github.com/ollama/ollama/pull/16100
* mlx: update the imagegen runner for mlx thread affinity by @pdevine in https://github.com/ollama/ollama/pull/16096
* mlx: avoid status timeout during inference by @dhiltgen in https://github.com/ollama/ollama/pull/16086
* mlx: fix macOS 26 target leakage in v3 metallib by @dhiltgen in https://github.com/ollama/ollama/pull/16053

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.23.2...v0.23.3

## What's Changed

* `ollama launch` no longer includes Claude Desktop due to the third-party integration being limited to Anthropic models.
* Use `ollama launch claude-desktop --restore` to restore Claude Desktop to its normal state.
* `/api/show` responses are now cached, improving median latency by **~6.7x**, which speeds up loading for integrations like VS Code.
* Improved backup workflow when managing launch integrations
* Cleaner image generation layout in the MLX runner

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.23.1...v0.23.2

## Gemma 4 MTP (Multi-Token Prediction) for the MLX runner

Gemma 4 MTP speculative decoding is now supported on Macs. This can give over a 2x speed increase for the Gemma 4 31B model on coding tasks.
```
ollama run gemma4:31b-coding-mtp-bf16
```

## What's Changed

* Update MLX and MLX-C with threading fixes by @dhiltgen in https://github.com/ollama/ollama/pull/15845
* go: bump to 1.26 by @ParthSareen in https://github.com/ollama/ollama/pull/15904
* Add Gemma 4 MTP speculative decoding by @pdevine in https://github.com/ollama/ollama/pull/15980

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.23.0...v0.23.1

## Claude Desktop

Claude Desktop is now supported with Ollama Launch. Claude Cowork and Claude Code are supported within the Claude Desktop App.

```
ollama launch claude-desktop
```

### Claude Cowork

### Claude Code

Claude Code on the terminal can still be accessed through the CLI with:

```
ollama launch claude
```

### Not supported yet

- Web Search (coming soon)
- Extensions

## What's Changed

* Launch Claude Desktop with `ollama launch claude-desktop`
* The Ollama app now surfaces featured models from server-driven recommendations
* Fixed OpenClaw gateway timeout on Windows by enforcing IPv4 loopback (thanks @UniquePratham)
* Hardened Metal initialization to gracefully handle ggml kernel compilation failures

## New Contributors

* @UniquePratham made their first contribution in https://github.com/ollama/ollama/pull/15726

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.22.1...v0.23.0

## What's Changed

* Updated the **Gemma 4** renderer for thinking and tool calling improvements
* Model recommendations are now updated without updating Ollama
* Aligned the desktop app's launch page with `ollama launch` integrations
* Fixed the Poolside integration title in `ollama launch`

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.22.0...v0.22.1

## New models

* NVIDIA's [Nemotron 3 Omni](https://ollama.com/library/nemotron3)
* Poolside's first open-weight coding model - [Laguna XS.2](https://ollama.com/library/laguna-xs.2)

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.21.2...v0.22.0

## v0.21.3

## What's Changed

* api: accept "max" as a think value by @ParthSareen in https://github.com/ollama/ollama/pull/15787
* openai: map responses reasoning effort to think by @ParthSareen in https://github.com/ollama/ollama/pull/15789

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.21.2...v0.21.3-rc0

## What's Changed

* Improved reliability of the OpenClaw onboarding flow in `ollama launch`
* Recommended models in `ollama launch` now appear in a fixed, canonical order
* OpenClaw integration now bundles Ollama's web search plugin in OpenClaw

## New Contributors

* @madflow made their first contribution in
https://github.com/ollama/ollama/pull/15733

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.21.1...v0.21.2

## What's Changed

### Kimi CLI

You can now install and run the Kimi CLI through Ollama.

```
ollama launch kimi --model kimi-k2.6:cloud
```

Kimi CLI with Kimi K2.6 excels at long-horizon agentic execution tasks through a multi-agent system.

* **MLX runner adds logprobs support** for compatible models
* **Faster MLX sampling** with fused top-P and top-K in a single sort pass, plus repeat penalties applied in the sampler
* **Improved MLX prompt tokenization** by moving tokenization into request handler goroutines
* **Better MLX thread safety** for array management
* **GLM4 MoE Lite performance improvement** with a fused sigmoid router head
* **Fixed model picker showing stale model** after switching chats in the macOS app
* **Fixed structured outputs for Gemma 4** when `think=false`

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.21.0...v0.21.1

## Hermes Agent

```
ollama launch hermes
```

Hermes learns with you, automatically creating skills to better serve your workflows. Great for research and engineering tasks.

## What's Changed

- **Gemma 4 on MLX.** Added support for running Gemma 4 via MLX on Apple Silicon, including a text-only MLX runtime for the model. The MLX backend also picked up mixed-precision quantization, better capability detection, and a batch of new op wrappers (Conv2d, Pad, activations, trig, masked SDPA, and RoPE-with-freqs).
- **Hermes and GitHub Copilot CLI in `ollama launch`.** Added both integrations, which can now be configured in one command alongside the rest of the supported coding agents.
- **OpenCode moved to inline config.** `ollama launch opencode` now writes its config inline rather than to a separate file, matching how other integrations are handled.
- **`ollama launch` no longer rewrites config when nothing changed.** Pressing → on a configured multi-model integration, or passing `--model` with the current primary, used to trigger a confirmation prompt and rewrite both the editor's config file and `config.json`. Now it's a no-op when the resolved model list matches what's already saved.
- **Fixed `ollama launch openclaw --yes`** so that it correctly skips the channels configuration step, letting non-interactive setups complete cleanly.
- **Restored the Gemma 4 nothink renderer** with the e2b-style prompt.
- **Fixed the Gemma 4 compiler error** that was breaking Metal builds.
- **Fixed macOS cross-compiles** so they no longer trigger `generate`, which was breaking cmake builds on some Xcode versions.
- **Quieted cgo builds** by suppressing deprecated warnings during `go build`.
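
The no-op behavior described above for `ollama launch` can be illustrated with a small sketch. This is not Ollama's implementation (which is written in Go); the function name `save_models_if_changed` and the `{"models": [...]}` file layout are hypothetical, chosen only to show the idea of comparing the resolved model list against the saved one before writing anything.

```python
import json
from pathlib import Path


def save_models_if_changed(config_path: Path, resolved: list[str]) -> bool:
    """Write the model list only if it differs from the saved one.

    Returns True when the file was written, False when it was a no-op.
    The "models" key is a hypothetical layout used for illustration.
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text())

    # Order matters: the first entry is treated as the primary model,
    # so a plain list comparison is the right equality check here.
    if config.get("models") == resolved:
        return False  # nothing changed: no prompt, no rewrite

    config["models"] = list(resolved)  # preserve any other keys in the file
    config_path.write_text(json.dumps(config, indent=2))
    return True
```

Comparing before writing keeps file modification times stable and avoids prompting the user to confirm a change that would not alter anything.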

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.20.7...v0.21.0

## What's Changed

* Fix quality of gemma:e2b and gemma:e4b when thinking is disabled
* ROCm: Update to ROCm 7.2.1 on Linux by @saman-amd in https://github.com/ollama/ollama/pull/15483

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.20.6...v0.20.7

## What's Changed

* Gemma 4 tool calling ability is improved and updated to use Google's latest post-launch fixes
* Parallel tool calling improved for streaming responses
* [Hermes agent](https://docs.ollama.com/integrations/hermes) Ollama integration guide is now available
* Ollama app is updated to fix image attachment errors

## New Contributors

* @matteocelani made their first contribution in [#15272](https://github.com/ollama/ollama/pull/15272)

**Full Changelog**: https://github.com/ollama/ollama/compare/v0.20.5...v0.20.6
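
For readers consuming the streaming responses mentioned in the 0.20.6 notes, the following is a minimal client-side sketch of collecting several tool calls from a stream. It assumes the newline-delimited JSON chunk shape of Ollama's `/api/chat` endpoint (`message.content`, `message.tool_calls`, `done`); the helper name `collect_stream` and the sample tool names are hypothetical, and the sketch parses pre-recorded lines rather than calling a server.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> tuple[str, list[dict]]:
    """Accumulate text and tool calls from newline-delimited JSON chunks."""
    content_parts: list[str] = []
    tool_calls: list[dict] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        message = chunk.get("message") or {}
        content_parts.append(message.get("content") or "")
        # Parallel tool calls may arrive together in one chunk or be
        # spread over several chunks, so extend rather than overwrite.
        tool_calls.extend(message.get("tool_calls") or [])
        if chunk.get("done"):
            break
    return "".join(content_parts), tool_calls
```

Appending instead of keeping only the latest `tool_calls` value is the point of the sketch: a client that overwrites on each chunk would silently drop all but the last call.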