v1.40.0
- Core: Fix `--yolo` mode unintentionally preventing the model from calling `AskUserQuestion`. Yolo used to inject a system reminder telling the model it was in "non-interactive mode" and must not ask, and the ask-user tool auto-dismissed in yolo. Both were wrong: yolo only bypasses permission approvals; it does not mean "the user is gone". Yolo no longer injects model guidance, and the user remains reachable through `AskUserQuestion`
- CLI: Split permission bypass from unattended execution. `--yolo` bypasses permission approvals while the user is still at the terminal, while `--afk` / `/afk` means away-from-keyboard: `AskUserQuestion` is auto-dismissed and approvals are handled automatically. `--print` now uses runtime AFK behavior instead of yolo, matching its non-interactive execution model. The status bar shows `yolo` and `afk` independently, and `/yolo` and `/afk` toggle their own flag without disturbing the other
- Config: Replace `skip_yolo_prompt_injection` with `skip_afk_prompt_injection` now that yolo no longer injects model guidance. The old config key is ignored if present
- Shell: Fix `/yolo` toggling producing misleading UI messages when afk is also active. `/yolo` used to read the combined auto-approval state, so pressing it under afk would claim approval was now required even though afk still handled approvals automatically. `/yolo` now reads and writes only the yolo flag, leaving afk alone
- Web: Fix AI title generation overwriting a manually-set title when the LLM call finishes after the user has already renamed the session; the final write now reloads state and yields to a `title_generated` flag set by another request
- Web: Surface session rename, archive, unarchive, and title generation failures as toast notifications instead of only logging to the console
- Web: Keep tool media previews visible when tool details are collapsed. Images and videos returned by tools now render below the tool card instead of inside the collapsible detail area, so preview thumbnails remain accessible after collapsing a tool
- Kosong: Fix stale API key after OAuth token refresh in Kimi provider. `on_retryable_error` now reads the current `api_key` from the live client instead of the cached `_api_key`, so that OAuth token refreshes applied via `client.api_key` are preserved when the client is rebuilt after a retryable error
- Core: Approval requests no longer auto-timeout after 5 minutes, which previously surfaced as `Rejected by user`; active foreground and subagent approvals now wait indefinitely for user response
- Shell: Fix `/usage` remaining quota rendering. The progress bar, warning colors, and `% left` label now all use the remaining quota ratio consistently, so high remaining quota shows as green/full and near-exhausted quota shows as yellow or red
- Shell: Show active background agent task count in the prompt status bar. The existing `⚙ bash: N` badge only counted background Shell tasks and filtered out background Agent subagents, so when many subagents were running the prompt looked idle and users could not tell work was in progress; the toolbar now renders `⚙ bash: N` and `⚙ agent: N` as two independent badges (each hidden when its count is 0) and drops the agent badge first when the terminal is too narrow to fit both
- Auth: Fix managed model list refresh silently failing for OAuth users with expired tokens. The background `/models` sync now detects 401 responses, forces an OAuth token refresh, and retries with the refreshed token; if the refresh fails or the refreshed token is still rejected, it falls back to the originally configured static API key instead of skipping the provider
- Core: Fix connection recovery not triggering OAuth refresh when the retry returns 401. After recreating the HTTP client on `APIConnectionError` or `APITimeoutError`, the retry now re-enters the full recovery path so a subsequent 401 correctly refreshes the OAuth token instead of bubbling to the user as an unrecoverable error
- Shell: Echo `/skill:*` and `/flow:*` inputs in the transcript so workflow commands stay visible after enter; operational slash commands like `/usage` and `/model` remain hidden
- Core: Raise default `max_steps_per_turn` from 500 to 1000 so long-running agents are less likely to hit the per-turn limit
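
The yolo/afk split described in the Core, CLI, and Shell entries above can be illustrated with a minimal sketch. The `SessionFlags` class, its method names, and the message strings are hypothetical and are not taken from the project's code; only the semantics (yolo bypasses approvals, afk additionally makes the user unreachable, and `/yolo` touches only its own flag) come from the notes.

```python
from dataclasses import dataclass


@dataclass
class SessionFlags:
    """Hypothetical container for the two independent flags."""

    yolo: bool = False  # bypass permission approvals; user is still present
    afk: bool = False   # away from keyboard; nobody can answer questions

    def auto_approve(self) -> bool:
        # Either flag is enough to handle approvals automatically.
        return self.yolo or self.afk

    def can_ask_user(self) -> bool:
        # Only afk makes the user unreachable; yolo alone does not.
        return not self.afk

    def toggle_yolo(self) -> str:
        # Read and write only the yolo flag, then describe the real outcome.
        self.yolo = not self.yolo
        if self.yolo:
            return "yolo on"
        if self.afk:
            return "yolo off (afk still auto-approves)"
        return "yolo off (approvals required)"
```

With both flags set, toggling yolo off leaves `afk` untouched, so approvals are still automatic and the message says so instead of claiming approval is now required.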
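
The Auth entry above describes a three-stage fallback for the background `/models` sync. A rough sketch of that control flow follows, assuming a hypothetical `fetch` callable that raises `Unauthorized` on a 401 and a hypothetical `refresh` callable that returns a new token or `None` on failure; none of these names come from the actual implementation.

```python
from typing import Callable, List, Optional


class Unauthorized(Exception):
    """Stand-in for an HTTP 401 response (hypothetical)."""


def sync_models(
    fetch: Callable[[str], List[str]],
    oauth_token: str,
    refresh: Callable[[], Optional[str]],
    static_api_key: str,
) -> List[str]:
    # 1. Try the current OAuth token.
    try:
        return fetch(oauth_token)
    except Unauthorized:
        pass

    # 2. Force a refresh and retry with the refreshed token.
    new_token = refresh()
    if new_token is not None:
        try:
            return fetch(new_token)
        except Unauthorized:
            pass

    # 3. Refresh failed or the new token was rejected:
    #    fall back to the static key rather than skipping the provider.
    return fetch(static_api_key)
```

In this sketch an `Unauthorized` raised by the static key propagates to the caller; how the real sync reports that final failure is not stated in the notes.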
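
The status bar badge behavior from the Shell entry above (two independent badges, each hidden at zero, agent badge dropped first when space is tight) might look roughly like this. The function name, the two-space separator, and the use of `len()` as an approximation of display width are assumptions made for illustration.

```python
def render_badges(bash_count: int, agent_count: int, width: int) -> str:
    """Build the badge string that fits in `width` columns (hypothetical helper)."""
    badges = []
    if bash_count > 0:
        badges.append(f"⚙ bash: {bash_count}")
    if agent_count > 0:
        badges.append(f"⚙ agent: {agent_count}")

    # The agent badge is appended last, so popping from the end
    # drops it before the bash badge when the line is too long.
    while badges and len("  ".join(badges)) > width:
        badges.pop()
    return "  ".join(badges)
```

For example, `render_badges(2, 3, 40)` keeps both badges, while `render_badges(2, 3, 15)` keeps only the bash badge.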
Fetched June 4, 2026


