Claude Opus 4.6: What's New and What It Means for AI-Powered Products

Anthropic just shipped their most capable model yet. We break down adaptive thinking, the 1M token context window, context compaction, and the pricing changes that matter.

February 6, 2026

What's Shipping Today

  • Claude Opus 4.6 — Anthropic's most intelligent model, optimized for coding, agents, and professional work
  • 1M token context (beta) — Process entire codebases in a single request
  • Adaptive thinking — Replaces budget_tokens with a simpler effort parameter
  • Context compaction (beta) — Automatic summarization when approaching context limits
  • US-only inference — Guaranteed US processing at 1.1x pricing

Anthropic released Claude Opus 4.6 today — their most intelligent model to date, positioned as the world's best for coding, enterprise agents, and professional work. This isn't an incremental update. There are meaningful architectural changes that affect how you build with Claude's API.

We've been building on the Claude API since day one (ClaudeArchitect runs on it), so here's our breakdown of what actually matters, what you should migrate, and what's just marketing.

Pricing: What It Costs

Opus 4.6 is premium-tier, priced above Sonnet but reflecting significantly higher capability. Here's the full picture:

| Tier | Input (per 1M tokens) | Output (per 1M tokens) | Notes |
|---|---|---|---|
| Standard (up to 200K) | $5 | $25 | Base pricing |
| Extended context (200K-1M) | $10 | $37.50 | 2x input / 1.5x output |
| US-only inference | $5.50 | $27.50 | 1.1x multiplier |

For comparison, Claude Sonnet 4.5 runs at $3/$15 per million tokens. Opus 4.6 is roughly 1.7x the price for input and output — a meaningful step up, but the capability gap is significant, especially for complex reasoning tasks.
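To sanity-check these numbers against your own traffic, here's a minimal sketch comparing per-request cost across the two models. The rates are hardcoded from the table above, not pulled from any pricing API, and the token counts are just an example workload:

```python
# Sketch: compare per-request cost of Opus 4.6 vs Sonnet 4.5
# at the published per-million-token rates.

def request_cost(input_tokens, output_tokens, in_rate, out_rate):
    """Cost in dollars for one request at flat per-1M-token rates."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example workload: 50K input tokens, 2K output tokens per request
opus = request_cost(50_000, 2_000, 5.00, 25.00)
sonnet = request_cost(50_000, 2_000, 3.00, 15.00)
print(f"Opus 4.6:   ${opus:.4f}")    # $0.3000
print(f"Sonnet 4.5: ${sonnet:.4f}")  # $0.1800
```

At this shape of workload the gap is about 12 cents per request; multiply by your daily volume before deciding to upgrade.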

Adaptive Thinking: The Big Migration

Action required: adaptive thinking replaces budget_tokens, and migration will be needed for future models.

This is the change that requires attention. Extended thinking previously used budget_tokens — a hard cap on how many tokens Claude could spend reasoning. The problem: set it too low, and complex tasks got truncated reasoning. Set it too high, and you burned tokens on simple questions.

Adaptive thinking replaces this with an effort parameter that lets Claude decide how much thinking is appropriate:

Python — Before (budget_tokens)
import anthropic

client = anthropic.Anthropic()

# Old way — will be deprecated in future releases
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=8000,
    thinking={
        "type": "enabled",
        "budget_tokens": 4000  # Fixed cap
    },
    messages=[...]
)
Python — After (effort)
# New way — Claude decides how much thinking is needed
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=8000,
    thinking={
        "type": "enabled",
        "effort": "medium"  # low | medium | high
    },
    messages=[...]
)

The key insight: A simple "summarize this paragraph" gets minimal thinking. A complex multi-step reasoning problem gets deep analysis. You're no longer guessing the right token budget — you're telling Claude how hard to try.

Migration urgency: budget_tokens still works on Opus 4.6, but Anthropic has explicitly said it will be retired in future model releases. If you're using extended thinking, plan your migration now.
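If you have many call sites, a small shim can translate legacy configs while you migrate. A hedged sketch: the budget-to-effort thresholds below are purely illustrative assumptions, not an Anthropic recommendation, and `migrate_thinking` is a hypothetical helper name:

```python
# Sketch: translate a legacy budget_tokens thinking config into the
# new effort-based config. Threshold-to-effort mapping is illustrative.

def migrate_thinking(thinking: dict) -> dict:
    """Return an effort-based config; pass through already-migrated ones."""
    if "effort" in thinking or "budget_tokens" not in thinking:
        return thinking  # already migrated, or thinking not configured
    budget = thinking["budget_tokens"]
    if budget <= 2_000:
        effort = "low"
    elif budget <= 8_000:
        effort = "medium"
    else:
        effort = "high"
    return {"type": thinking.get("type", "enabled"), "effort": effort}

print(migrate_thinking({"type": "enabled", "budget_tokens": 4000}))
# {'type': 'enabled', 'effort': 'medium'}
```

Centralizing the translation in one function means you can flip every call site at once and delete the shim when the old parameter is retired.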

1M Token Context Window (Beta)

Process entire codebases or dozens of research papers in one request.

The standard 200K context window jumps to 1M tokens in beta. To put that in perspective:

  • 200K tokens ≈ a mid-sized codebase or 3-4 research papers
  • 1M tokens ≈ an entire large codebase, or 15-20 research papers, or a full book

The pricing structure is important: requests stay at base pricing up to 200K. Beyond that, you pay 2x on input and 1.5x on output. A request using 500K input tokens costs effectively $10/M — still reasonable for scenarios where you genuinely need the full context.
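As a sketch of that math, assuming (per the description above) that a request bills at the extended rate once its input crosses 200K tokens, with rates hardcoded from the pricing table:

```python
# Sketch: input cost under the two tiers described above. Requests
# at or below 200K input tokens bill at $5/M; larger requests bill
# at the extended rate of $10/M.

LONG_CONTEXT_THRESHOLD = 200_000

def input_cost(input_tokens: int) -> float:
    """Input cost in dollars for a single request."""
    rate = 5.00 if input_tokens <= LONG_CONTEXT_THRESHOLD else 10.00
    return input_tokens * rate / 1_000_000

print(f"${input_cost(150_000):.2f}")  # $0.75 at base pricing
print(f"${input_cost(500_000):.2f}")  # $5.00 at the 2x extended rate
```

Note the step at the threshold: a 500K-token request costs more than three times a 200K-token one, which is why batching everything into one giant request isn't automatically the cheap option.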

Best use cases: Codebase-wide refactoring analysis, large document synthesis, comprehensive research review, and long-running agent conversations that accumulate context over many turns.

Context Compaction (Beta)

Automatic summarization when approaching context limits.

This is particularly relevant for agent-based applications. When a conversation approaches the context limit, Claude automatically summarizes older context to make room for new information. Think of it as intelligent memory management.

Instead of hitting a wall at your context limit and losing the conversation, the model gracefully compresses earlier turns while preserving the most important information. For tools like ClaudeArchitect — where orchestration can involve many specialist calls in a single session — this is a meaningful quality-of-life improvement.

The practical impact: Longer agent sessions without context window errors. More reliable multi-step workflows. Less need for manual conversation summarization in your application code.
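For context, here's roughly what that manual workaround looks like in application code today. This is a hedged sketch, not the API feature itself: `compact_history` is a hypothetical helper, and a real implementation would summarize old turns with a model call and count tokens properly rather than dropping them behind a stub:

```python
# Sketch of manual conversation trimming, the app-side pattern that
# built-in context compaction replaces: keep the most recent turns
# and collapse everything older into a single summary placeholder.

def compact_history(messages, max_chars=40_000, keep_recent=6):
    """Collapse older messages into one stub when history grows too long."""
    total = sum(len(m["content"]) for m in messages)
    if total <= max_chars or len(messages) <= keep_recent:
        return messages  # still fits, nothing to do
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = {
        "role": "user",
        "content": f"[Summary of {len(old)} earlier messages omitted]",
    }
    return [summary] + recent

history = [{"role": "user", "content": "x" * 10_000} for _ in range(10)]
print(len(compact_history(history)))  # 7: one summary stub + 6 recent turns
```

With compaction handled server-side, this kind of bookkeeping (and the quality loss from a crude length heuristic) moves out of your application code.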

US-Only Inference

For workloads with data residency requirements, Anthropic now offers guaranteed US-based processing at a 1.1x pricing multiplier on both input and output tokens. This is relevant for enterprises handling sensitive data, healthcare applications under HIPAA, or government-adjacent work.

At a 10% premium, it's reasonably priced for the compliance guarantee. Most consumer applications won't need this, but it's good to know it exists.

What This Means for AI-Powered Products

We build on the Claude API daily, and here's our honest assessment:

Immediate wins

  • Higher quality reasoning — Opus 4.6 is measurably smarter than previous models for complex, multi-step tasks
  • Context compaction — Long agent sessions become more reliable without engineering workarounds
  • Adaptive thinking — Better cost efficiency: simple tasks use fewer thinking tokens automatically

Things to watch

  • Cost at scale — $5/$25 per million is premium pricing. Run the math on your volume before upgrading from Sonnet
  • 1M context pricing — The 2x/1.5x multiplier above 200K adds up fast. Use it strategically, not by default
  • budget_tokens deprecation — This will break eventually. Migrate to effort proactively

Built on Claude's Latest Intelligence

ClaudeArchitect uses Claude's API to power specialist agents for business documents, content creation, media production, and more.


Frequently Asked Questions

What is Claude Opus 4.6?

Claude Opus 4.6 is Anthropic's most intelligent AI model, released February 6, 2026. It's designed for coding, enterprise agents, and professional work, with a model ID of claude-opus-4-6.

How much does Claude Opus 4.6 cost?

$5 per million input tokens and $25 per million output tokens at base pricing. The 1M context beta charges 2x input and 1.5x output for requests exceeding 200K tokens. US-only inference adds a 1.1x multiplier.

What is adaptive thinking?

Adaptive thinking replaces the budget_tokens parameter with effort (low, medium, high). Instead of setting a fixed thinking token cap, Claude dynamically decides how much reasoning each task needs. The old parameter still works on Opus 4.6 but will be removed in future releases.

Should I upgrade from Claude Sonnet to Opus 4.6?

It depends on your use case. Opus 4.6 excels at complex reasoning, multi-step agent workflows, and coding tasks. For simpler text generation, Sonnet 4.5 at $3/$15 per million tokens may be more cost-effective. Evaluate based on task complexity and volume.

What is context compaction?

Context compaction automatically summarizes older context when a conversation approaches the token limit. This extends effective conversation length without hitting hard context window errors — particularly useful for agent-based applications running multi-step tasks.