Token Optimization

Claude Code usage can get expensive if you don’t manage token consumption. The settings on this page target an estimated 60-70% cost reduction without sacrificing quality on most coding tasks.

Quick Wins

Add to ~/.claude/settings.json:

{
  "model": "sonnet",
  "env": {
    "MAX_THINKING_TOKENS": "10000",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
    "CLAUDE_CODE_SUBAGENT_MODEL": "haiku"
  }
}
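If you already have a settings file, pasting the block over it would drop your other keys. The following is a minimal sketch of a recursive merge that applies the recommended values while preserving existing entries. The `merge_settings` helper and the sample `existing` contents are hypothetical and only illustrate the idea; they are not part of Claude Code.

```python
import json
from copy import deepcopy

# Values copied from the Quick Wins block above.
RECOMMENDED = {
    "model": "sonnet",
    "env": {
        "MAX_THINKING_TOKENS": "10000",
        "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
        "CLAUDE_CODE_SUBAGENT_MODEL": "haiku",
    },
}


def merge_settings(existing: dict, overrides: dict) -> dict:
    """Return a copy of `existing` with `overrides` applied recursively.

    Nested dicts (such as "env") are merged key by key, so unrelated
    keys already present in the file are preserved.
    """
    merged = deepcopy(existing)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical existing file contents, for illustration only.
existing = {"env": {"MY_OTHER_VAR": "1", "MAX_THINKING_TOKENS": "31999"}}
print(json.dumps(merge_settings(existing, RECOMMENDED), indent=2))
```

To apply it for real, you would load ~/.claude/settings.json with `json.load`, pass the result through `merge_settings`, and write it back with `json.dump`.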

Model Selection

Sonnet for Daily Development

Use Sonnet as your default model. It handles 80%+ of coding tasks and costs ~60% less than Opus.

Setting	Default	Recommended	Impact
`model`	opus	sonnet	~60% cost reduction
Task Coverage	-	80%+	Most coding tasks

When to Switch to Opus

/model opus

Use Opus only for:

Complex architectural decisions
Deep debugging sessions
Multi-system refactoring
First-principles problem solving

Switch back after the complex task:

/model sonnet

Thinking Token Limits

Claude’s “thinking” happens behind the scenes and consumes tokens you don’t see.

Hidden Cost: Extended thinking defaults to 31,999 tokens per request. At scale, this can become your largest cost driver.

Reduce Thinking Tokens

{
  "env": {
    "MAX_THINKING_TOKENS": "10000"
  }
}

Impact: ~70% reduction in the maximum hidden thinking cost per request (a 10,000-token cap instead of 31,999). Most coding tasks don’t need 32k thinking tokens. 10k is sufficient for:

Code review
Bug fixes
Feature implementation
Refactoring

Only raise the limit for:

Large-scale architecture decisions
Complex debugging across many files
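The ~70% figure follows directly from the two budgets quoted above. As a quick check, this sketch computes the reduction in the maximum thinking budget. It is an upper bound on the saving: requests that use fewer thinking tokens than the cap will save less.

```python
DEFAULT_THINKING_TOKENS = 31_999  # default budget quoted above
REDUCED_THINKING_TOKENS = 10_000  # recommended MAX_THINKING_TOKENS value


def thinking_budget_reduction(default: int, reduced: int) -> float:
    """Fractional reduction in the maximum thinking budget per request."""
    return 1 - reduced / default


reduction = thinking_budget_reduction(DEFAULT_THINKING_TOKENS, REDUCED_THINKING_TOKENS)
print(f"{reduction:.1%}")  # prints 68.7%
```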

Auto-Compaction Strategy

Context windows fill up during long sessions. Claude auto-compacts at 95% of the context window by default, which is too late.

Compact Earlier

{
  "env": {
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50"
  }
}

Why 50%?

Better quality in long sessions
Prevents context degradation
More aggressive cleanup of irrelevant context

Compaction at 95% means you’ve already filled 190k of your 200k window. Compacting at 50% gives Claude more room to work.
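The arithmetic behind those numbers, as a small sketch assuming the 200k window used throughout this guide:

```python
CONTEXT_WINDOW = 200_000  # window size assumed throughout this guide


def tokens_at_threshold(window: int, pct: int) -> tuple[int, int]:
    """Return (tokens used, tokens still free) when compaction triggers."""
    used = window * pct // 100
    return used, window - used


for pct in (95, 50):
    used, free = tokens_at_threshold(CONTEXT_WINDOW, pct)
    print(f"{pct}% threshold: {used:,} used, {free:,} free")
# 95% threshold: 190,000 used, 10,000 free
# 50% threshold: 100,000 used, 100,000 free
```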

Manual Compaction

Use /compact at logical breakpoints instead of relying on auto-compaction.

When to Compact

After Research Phase

You’ve explored the codebase and found what you need. Compact before implementing.

/compact

After Milestone Completion

Feature is done, tests pass. Compact before starting the next feature.

After Debugging

Bug is fixed. Compact to clear investigation context before continuing.

After Failed Approach

You’ve reached a dead end. Compact to clear the failed attempt before trying a new approach.

When NOT to Compact

Don’t compact mid-implementation. You’ll lose:

Variable names and function signatures
File paths you’re working with
Partial state and context

Context Window Management

Each MCP tool description consumes tokens from your 200k window.

Critical: Too many MCPs can reduce your effective window from 200k to ~70k.

MCP Best Practices

In the project’s .claude/settings.json:

{
  "disabledMcpServers": ["supabase", "railway", "vercel"]
}

Limits:

Keep under 10 MCPs enabled per project
Keep under 80 tools active total
Disable unused MCPs in project config
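These limits are easy to check mechanically. The sketch below assumes you have already collected a tool count per enabled server. The `check_mcp_limits` helper and the tool counts are hypothetical, and "under" is read as strictly fewer than the limit.

```python
MAX_SERVERS = 10  # guideline: keep under 10 MCPs enabled per project
MAX_TOOLS = 80    # guideline: keep under 80 tools active in total


def check_mcp_limits(tools_per_server: dict[str, int]) -> list[str]:
    """Return one warning for each guideline the configuration breaks."""
    warnings = []
    server_count = len(tools_per_server)
    tool_count = sum(tools_per_server.values())
    if server_count >= MAX_SERVERS:
        warnings.append(f"{server_count} MCP servers enabled; keep under {MAX_SERVERS}")
    if tool_count >= MAX_TOOLS:
        warnings.append(f"{tool_count} tools active; keep under {MAX_TOOLS}")
    return warnings


# Hypothetical tool counts, for illustration only.
enabled = {"supabase": 30, "railway": 25, "vercel": 28}
for warning in check_mcp_limits(enabled):
    print(warning)  # prints: 83 tools active; keep under 80
```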

Check Active Tools

/mcp list

Disable any you’re not actively using.

Subagent Model Selection

Subagents handle delegated tasks. Use Haiku for routine work.

{
  "env": {
    "CLAUDE_CODE_SUBAGENT_MODEL": "haiku"
  }
}

Haiku is sufficient for:

Code review (code-reviewer agent)
Build error resolution (build-error-resolver agent)
Documentation updates (doc-updater agent)
Test generation (tdd-guide agent for simple cases)

Use Sonnet/Opus subagents for:

Complex architecture (architect agent)
Security audits (security-reviewer agent)
Multi-file refactoring
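The guidance above can be written down as a simple lookup. This is only an illustrative sketch: the agent names come from the lists above, but `suggested_subagent_model` is a hypothetical helper, not a Claude Code API, and the `complex_task` flag is an assumed stand-in for cases such as multi-file refactoring.

```python
# Agents the guide considers routine enough for Haiku.
HAIKU_AGENTS = {"code-reviewer", "build-error-resolver", "doc-updater", "tdd-guide"}
# Agents the guide says deserve a stronger model.
STRONGER_MODEL_AGENTS = {"architect", "security-reviewer"}


def suggested_subagent_model(agent: str, complex_task: bool = False) -> str:
    """Suggest a model tier for a subagent, following the lists above."""
    if agent in STRONGER_MODEL_AGENTS or complex_task:
        return "sonnet"  # or "opus" for the hardest cases
    # Everything else falls back to the CLAUDE_CODE_SUBAGENT_MODEL default.
    return "haiku"


print(suggested_subagent_model("doc-updater"))                    # haiku
print(suggested_subagent_model("architect"))                      # sonnet
print(suggested_subagent_model("tdd-guide", complex_task=True))   # sonnet
```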

Daily Workflow Commands

Command	When to Use	Cost Impact
`/model sonnet`	Default for most tasks	~60% cheaper than Opus
`/model opus`	Complex architecture, deep debugging	Full cost, use sparingly
`/clear`	Between unrelated tasks	Free, instant reset
`/compact`	Logical task breakpoints	Reduces context, improves quality
`/cost`	Monitor spending	Visibility into token usage

Example Workflow

Start Session

/model sonnet

Default model for daily work.

Implement Feature

Use agents and commands normally. Sonnet handles most tasks.

Hit Complex Problem

/model opus

Switch to Opus for deep architectural decisions.

Complete Complex Task

/model sonnet
/compact

Switch back to Sonnet and compact to clear context.

New Unrelated Task

/clear

Free, instant reset between unrelated tasks.

Agent Teams Warning

Agent Teams = Multiple Context Windows. Each teammate consumes tokens independently.

Only use Agent Teams when:

Parallelism provides clear value (multi-module work)
You need parallel reviews (security + code quality)

For sequential tasks, use subagents instead:

/plan → planner agent (single context)
/code-review → code-reviewer agent (single context)

Cost Monitoring

Check Current Usage

/cost

Shows token consumption for current session.

Track Over Time

Monitor your Claude Code dashboard:

Daily usage trends
Per-project costs
Model distribution (Opus vs Sonnet)

Target: 80%+ Sonnet usage, <20% Opus usage for optimal cost/performance.
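To see what that mix is worth, here is a rough sketch of the blended cost relative to running everything on Opus. It assumes the ~60% Sonnet discount quoted in this guide and that both models handle similar token volumes per task. It models the model mix only, not the thinking or compaction settings.

```python
SONNET_RELATIVE_COST = 0.40  # ~60% cheaper than Opus, per this guide
OPUS_RELATIVE_COST = 1.00


def blended_relative_cost(sonnet_share: float) -> float:
    """Cost relative to all-Opus usage for a given share of Sonnet work."""
    opus_share = 1 - sonnet_share
    return sonnet_share * SONNET_RELATIVE_COST + opus_share * OPUS_RELATIVE_COST


cost = blended_relative_cost(0.80)
print(f"relative cost: {cost:.0%}, savings: {1 - cost:.0%}")
# relative cost: 52%, savings: 48%
```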

Strategic Compaction Skill

ECC includes a strategic-compact skill that suggests /compact at logical breakpoints. See skills/strategic-compact/SKILL.md for the full decision guide.

Compaction Decision Tree

Completed research/exploration?
  → YES: /compact (clear research context)
  → NO: Continue

Milestone complete (feature done, tests pass)?
  → YES: /compact (clear before next feature)
  → NO: Continue

Debugging complete?
  → YES: /compact (clear investigation context)
  → NO: Continue

Failed approach, trying new direction?
  → YES: /compact (clear failed attempt)
  → NO: Continue

Mid-implementation?
  → YES: Do not compact (preserve working context)
  → NO: Continue
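The tree can also be expressed as a small function. This is a sketch only: `SessionState` and `should_compact` are hypothetical names, and it assumes the mid-implementation rule overrides the other branches, which the tree does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical flags describing where you are in a session."""
    mid_implementation: bool = False
    research_done: bool = False
    milestone_done: bool = False
    debugging_done: bool = False
    approach_failed: bool = False


def should_compact(state: SessionState) -> tuple[bool, str]:
    """Return (compact?, reason) following the decision tree above."""
    # Assumption: never compact mid-implementation, whatever else is true.
    if state.mid_implementation:
        return False, "preserve working context"
    if state.research_done:
        return True, "clear research context"
    if state.milestone_done:
        return True, "clear before next feature"
    if state.debugging_done:
        return True, "clear investigation context"
    if state.approach_failed:
        return True, "clear failed attempt"
    return False, "continue"


print(should_compact(SessionState(debugging_done=True)))
# (True, 'clear investigation context')
```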

Summary: Optimal Settings

{
  "model": "sonnet",
  "env": {
    "MAX_THINKING_TOKENS": "10000",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
    "CLAUDE_CODE_SUBAGENT_MODEL": "haiku"
  }
}

Cost Reduction Checklist

Expected Savings: 60-70% cost reduction with these optimizations applied.
