Claude Opus 4.1 model (claude-opus-4-1-20250805) is being deprecated with retirement scheduled for August 5, 2026. Users are recommended to migrate to Claude Opus 4.8.
Claude Platform
- The advisor tool now supports a
max_tokensparameter to cap the advisor model's output per call, reducing latency and output token cost for workloads that don't need full-length responses. - On the Claude API, requests returning
stop_reason: "refusal"without generated output are no longer billed.
Claude Managed Agents webhooks, multiagent orchestration, and self-hosted sandboxes are now available on Claude Platform on AWS. New IAM actions and AnthropicSelfHostedEnvironmentAccess managed policy added.
- In Claude Code, Auto mode expanded to more users for long-running tasks.
- Max plan users now default to fast mode on Claude Opus 4.8.
- Workflows available as research preview for defining and running multi-step agentic plans.
- Fast mode deprecated for Claude Opus 4.6, with removal ~30 days after launch.
Claude Opus 4.8 is now available as the most capable generally available model. Key features and updates:
- 1M token context window by default on Claude API, Amazon Bedrock, and Vertex AI (200k on Microsoft Foundry)
- 128k max output tokens
- Mid-conversation system messages: Send
role: "system"messages after user turns to preserve prompt cache hits when instructions change - Enhanced stop_details field on refusal responses with
categoryandexplanation - Effort parameter defaults to
highacross all surfaces - Minimum cacheable prompt length for prompt caching reduced to 1,024 tokens
- Adaptive thinking triggers reasoning only when needed, reducing wasted tokens
- High-resolution image input support (up to 2576 pixels on long edge)
- Task budgets, advisor tool, and computer use all supported
- Fast mode available as research preview on Claude API only
- Sampling parameters (
temperature,top_p,top_k) set to non-default values return 400 error - Claude Code: Auto mode expanded for long-running tasks; Max plan users default to fast mode; Workflows available as research preview
- Fast mode for Claude Opus 4.6 deprecated (removal ~30 days after launch)
The Messages API response now includes usage.output_tokens_details.thinking_tokens, reporting how many of the billed output tokens were extended thinking. When streaming, the breakdown appears only on the final message_delta event. No beta header is required.
MCP tunnels is now available as a Research Preview, allowing you to connect to MCP servers in your private network. Self-hosted sandboxes are now available for Claude Managed Agents as an alternative to running tool execution in Anthropic's infrastructure. Claude Managed Agents now support updating an agent's MCP server and tool configurations associated with an active session. Large outputs from agent_toolset and MCP tools exceeding 100K tokens are now automatically spilled to a file in the sandbox, with the model receiving a truncated preview.
The web search tool now returns richer SEC filing data, making it easier to ground financial research agents, earnings analysis, and due-diligence workflows in primary sources with citations.
Cache diagnostics is now available in public beta. Pass diagnostics.previous_message_id on a Messages request and the API reports a cache_miss_reason explaining where the prompt cache prefix diverged from the previous turn. Include the cache-diagnosis-2026-04-07 beta header in your requests.
Fast mode (research preview) now supports Claude Opus 4.7. Set speed: "fast" with model: "claude-opus-4-7" and the fast-mode-2026-02-01 beta header for significantly faster output token generation at premium pricing. Pricing, rate limits, and access are the same as for Opus 4.6 fast mode; interested customers should join the waitlist.
Claude Platform on AWS brings the Claude API to Anthropic-managed infrastructure accessible through AWS, with AWS billing and IAM authentication. Access the full Messages API, Files API, Message Batches API, Claude Managed Agents, Agent Skills, code execution, and tool use through native AWS endpoints.
Multiagent sessions and Outcomes are now in public beta under the standard managed-agents-2026-04-01 beta header. Vault credential background refresh is now supported for mcp_oauth credentials. Webhooks for Claude Managed Agents are now supported, with event types including session and vault lifecycle events. Additional filtering and sorting options are now supported—sessions can be filtered by status, and events can be filtered by type and creation time.
The 1M token context window beta (context-1m-2025-08-07) has been retired for Claude Sonnet 4.5 and Claude Sonnet 4. The beta header now has no effect on these models, and requests exceeding the standard 200k-token context window return an error. To use the 1M context window, migrate to Claude Sonnet 4.6 or Claude Opus 4.6, where it's generally available at standard pricing with no beta header required.
The Rate Limits API is now available, allowing administrators to programmatically query the rate limits configured for their organization and workspaces.
Released the Rate Limits API, allowing administrators to programmatically query the rate limits configured for their organization and workspaces.
Memory for Claude Managed Agents is now in public beta under the standard managed-agents-2026-04-01 header. See Using agent memory for the full integration guide.
We've retired the Claude Haiku 3 model (claude-3-haiku-20240307). All requests to this model will now return an error. We recommend upgrading to Claude Haiku 4.5.
Claude Opus 4.7 has been launched as the most capable generally available model for complex reasoning and agentic coding, at the same $5 / $25 per MTok pricing as Opus 4.6. Includes API breaking changes versus Opus 4.6. Claude in Amazon Bedrock is now open to all Amazon Bedrock customers, with Claude Opus 4.7 and Claude Haiku 4.5 available self-serve from the Bedrock console through the Messages API endpoint at /anthropic/v1/messages, in 27 AWS regions.
Claude Opus 4.7 is now available as the most capable generally available model for complex reasoning and agentic coding.
- Same pricing as Opus 4.6: $5 / $25 per MTok
- Enhanced capability improvements
- Updated tokenizer
- API breaking changes versus Opus 4.6; see migration guide before upgrading
- Claude in Amazon Bedrock now open to all Amazon Bedrock customers with Claude Opus 4.7 and Claude Haiku 4.5 available self-serve in 27 AWS regions
Claude Opus 4.7 is now available as the most capable generally available model for complex reasoning and agentic coding, at the same $5/$25 per MTok pricing as Opus 4.6. Includes capability improvements, new features, and an updated tokenizer. Note: Contains API breaking changes versus Opus 4.6 — see migration guide before upgrading.



