
Advisor tool max_tokens parameter and refusal billing changes

  • The advisor tool now supports a max_tokens parameter to cap the advisor model's output per call, reducing latency and output token cost for workloads that don't need full-length responses.
  • On the Claude API, requests returning stop_reason: "refusal" without generated output are no longer billed.
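As a rough illustration of the two changes above, the Python sketch below builds an advisor tool entry with an output cap and models the new refusal billing rule on the client side. Only the max_tokens parameter and the stop_reason value "refusal" come from this note. The helper names build_advisor_tool and is_billed, the "type" and "name" values in the tool entry, and the reading of "without generated output" as an empty content list with zero usage.output_tokens are assumptions made for illustration, so check the official API reference for the actual tool schema.

```python
from typing import Any


def build_advisor_tool(advisor_max_tokens: int) -> dict[str, Any]:
    """Build an advisor tool entry that caps the advisor model's output per call.

    Only `max_tokens` is taken from the release note. The "type" and "name"
    values are hypothetical placeholders, not the documented schema.
    """
    if advisor_max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    return {
        "type": "advisor",   # hypothetical placeholder
        "name": "advisor",   # hypothetical placeholder
        "max_tokens": advisor_max_tokens,
    }


def is_billed(response: dict[str, Any]) -> bool:
    """Client-side model of the billing rule described in the note.

    Assumption: "without generated output" means the response has an empty
    content list and reports zero output tokens.
    """
    refused = response.get("stop_reason") == "refusal"
    output_tokens = response.get("usage", {}).get("output_tokens", 0)
    has_output = bool(response.get("content")) or output_tokens > 0
    # A refusal with no generated output is not billed; everything else is.
    return not (refused and not has_output)
```

A lower cap suits workloads where the advisor only needs to return a short recommendation. The is_billed helper is meant for local cost estimates only; the provider's usage reporting remains the source of truth for what was actually charged.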

Fetched June 4, 2026