Claude Code usage gets expensive if you don’t manage token consumption. The settings below reduce costs substantially with little or no quality loss on typical coding tasks.

Quick Wins

Add to ~/.claude/settings.json:
{
  "model": "sonnet",
  "env": {
    "MAX_THINKING_TOKENS": "10000",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
    "CLAUDE_CODE_SUBAGENT_MODEL": "haiku"
  }
}
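If the settings file already holds other keys, the values above can be merged in rather than pasted over. The following is a minimal sketch, assuming the file is plain JSON at the path shown above; the helper names merge_settings and apply_to_file are hypothetical and not part of Claude Code.

```python
import json
from pathlib import Path

# Values copied from the Quick Wins block above.
RECOMMENDED = {
    "model": "sonnet",
    "env": {
        "MAX_THINKING_TOKENS": "10000",
        "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
        "CLAUDE_CODE_SUBAGENT_MODEL": "haiku",
    },
}


def merge_settings(existing: dict, overrides: dict) -> dict:
    """Return a copy of `existing` with `overrides` applied, merging nested dicts."""
    merged = dict(existing)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_to_file(path: Path) -> dict:
    """Merge RECOMMENDED into the JSON file at `path` and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    merged = merge_settings(existing, RECOMMENDED)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merged, indent=2) + "\n")
    return merged


# Example call (hypothetical usage; back up the file first):
# apply_to_file(Path.home() / ".claude" / "settings.json")
```

Unrelated keys, including other entries under "env", are left in place; only the three recommended values and the model are set.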

Model Selection

Sonnet for Daily Development

Use Sonnet as your default model. It handles 80%+ of coding tasks and costs ~60% less than Opus.
Setting | Default | Recommended | Impact
model | opus | sonnet | ~60% cost reduction
Task coverage | - | 80%+ | Most coding tasks

When to Switch to Opus

/model opus
Use Opus only for:
  • Complex architectural decisions
  • Deep debugging sessions
  • Multi-system refactoring
  • First-principles problem solving
Switch back after the complex task:
/model sonnet

Thinking Token Limits

Claude’s “thinking” happens behind the scenes and consumes tokens you don’t see.
Hidden Cost: Extended thinking defaults to 31,999 tokens per request. At scale, this is your largest cost driver.

Reduce Thinking Tokens

{
  "env": {
    "MAX_THINKING_TOKENS": "10000"
  }
}
Impact: the per-request thinking ceiling drops by ~70% (from 31,999 to 10,000 tokens), so the worst-case hidden thinking cost falls by the same proportion. Most coding tasks don’t need 32k thinking tokens. 10k is sufficient for:
  • Code review
  • Bug fixes
  • Feature implementation
  • Refactoring
Only raise the limit for:
  • Large-scale architecture decisions
  • Complex debugging across many files
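The size of the reduction follows directly from the two caps. The arithmetic below is a small sketch using only the figures quoted above; it measures the change in the ceiling, and actual savings depend on how much thinking each request really uses.

```python
DEFAULT_THINKING_CAP = 31_999  # default stated above
REDUCED_THINKING_CAP = 10_000  # value set via MAX_THINKING_TOKENS


def cap_reduction(default_cap: int, new_cap: int) -> float:
    """Fraction by which the per-request thinking ceiling shrinks."""
    return 1 - new_cap / default_cap


reduction = cap_reduction(DEFAULT_THINKING_CAP, REDUCED_THINKING_CAP)
print(f"Thinking cap reduced by {reduction:.1%}")  # prints "Thinking cap reduced by 68.7%"
```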

Auto-Compaction Strategy

Context windows fill up during long sessions. Claude auto-compacts at 95% by default, but this is too late.

Compact Earlier

{
  "env": {
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50"
  }
}
Why 50%?
  • Better quality in long sessions
  • Prevents context degradation
  • More aggressive cleanup of irrelevant context
Compaction at 95% means you’ve already filled 190k of your 200k window. Compacting at 50% gives Claude more room to work.
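The headroom difference between the two thresholds can be checked with a few lines of arithmetic. This sketch assumes the 200k window mentioned above; the function name compaction_point is illustrative.

```python
CONTEXT_WINDOW = 200_000  # window size quoted above


def compaction_point(window: int, threshold_pct: int) -> tuple[int, int]:
    """Return (tokens used when compaction triggers, tokens still free)."""
    used = window * threshold_pct // 100
    return used, window - used


for pct in (95, 50):
    used, free = compaction_point(CONTEXT_WINDOW, pct)
    print(f"{pct}% threshold: compacts at {used:,} tokens, {free:,} free")
```

At 95% only 10,000 tokens remain when compaction starts; at 50% there are still 100,000.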

Manual Compaction

Use /compact at logical breakpoints instead of relying on auto-compaction.

When to Compact

1. After Research Phase
You’ve explored the codebase and found what you need. Compact before implementing.
/compact

2. After Milestone Completion
Feature is done, tests pass. Compact before starting the next feature.

3. After Debugging
Bug is fixed. Compact to clear investigation context before continuing.

4. After Failed Approach
Dead end reached. Compact to clear the failed attempt before trying a new approach.

When NOT to Compact

Don’t compact mid-implementation. You’ll lose:
  • Variable names and function signatures
  • File paths you’re working with
  • Partial state and context

Context Window Management

Each MCP tool description consumes tokens from your 200k window.
Critical: Too many MCPs can reduce your effective window from 200k to ~70k.

MCP Best Practices

In the project’s .claude/settings.json:
{
  "disabledMcpServers": ["supabase", "railway", "vercel"]
}
Limits:
  • Keep under 10 MCPs enabled per project
  • Keep under 80 tools active total
  • Disable unused MCPs in project config
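The two numeric limits above are easy to check mechanically once you know how many tools each enabled server exposes. The sketch below is illustrative only: the server names and tool counts are made up, and check_mcp_limits is a hypothetical helper, not a Claude Code feature.

```python
MAX_SERVERS = 10  # "under 10 MCPs" from the limits above
MAX_TOOLS = 80    # "under 80 tools" from the limits above


def check_mcp_limits(tools_per_server: dict[str, int]) -> list[str]:
    """Return warnings for a mapping of enabled server name -> tool count."""
    warnings = []
    if len(tools_per_server) >= MAX_SERVERS:
        warnings.append(
            f"{len(tools_per_server)} MCP servers enabled (keep under {MAX_SERVERS})"
        )
    total_tools = sum(tools_per_server.values())
    if total_tools >= MAX_TOOLS:
        warnings.append(f"{total_tools} tools active (keep under {MAX_TOOLS})")
    return warnings


# Hypothetical inventory; real counts come from your own MCP configuration.
enabled = {"github": 30, "supabase": 25, "railway": 15, "vercel": 12}
for warning in check_mcp_limits(enabled):
    print(warning)
```

With this sample inventory the server count is fine, but the 82 tools exceed the tool limit, so disabling one unused server would bring the project back under it.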

Check Active Tools

/mcp list
Disable any you’re not actively using.

Subagent Model Selection

Subagents handle delegated tasks. Use Haiku for routine work.
{
  "env": {
    "CLAUDE_CODE_SUBAGENT_MODEL": "haiku"
  }
}
Haiku is sufficient for:
  • Code review (code-reviewer agent)
  • Build error resolution (build-error-resolver agent)
  • Documentation updates (doc-updater agent)
  • Test generation (tdd-guide agent for simple cases)
Use Sonnet/Opus subagents for:
  • Complex architecture (architect agent)
  • Security audits (security-reviewer agent)
  • Multi-file refactoring
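The split above amounts to a small routing table. The sketch below uses the agent names from the two lists; the table itself and the suggested_model helper are assumptions for illustration, and picking "sonnet" rather than "opus" for the stronger tier is an arbitrary choice between the two options named above.

```python
# Agent names come from the lists above; the routing itself is illustrative.
HAIKU_AGENTS = {"code-reviewer", "build-error-resolver", "doc-updater", "tdd-guide"}
STRONGER_AGENTS = {"architect", "security-reviewer"}


def suggested_model(agent: str, default: str = "haiku") -> str:
    """Suggest a model tier for a named subagent."""
    if agent in STRONGER_AGENTS:
        return "sonnet"  # or "opus" for the hardest cases
    if agent in HAIKU_AGENTS:
        return "haiku"
    return default  # unknown agents fall back to the cheaper default


for name in ("code-reviewer", "architect", "doc-updater"):
    print(f"{name}: {suggested_model(name)}")
```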

Daily Workflow Commands

Command | When to Use | Cost Impact
/model sonnet | Default for most tasks | 60% cheaper than Opus
/model opus | Complex architecture, deep debugging | Full cost, use sparingly
/clear | Between unrelated tasks | Free instant reset
/compact | Logical task breakpoints | Reduces context, improves quality
/cost | Monitor spending | Visibility into token usage

Example Workflow

1. Start Session
/model sonnet
Default model for daily work.

2. Implement Feature
Use agents and commands normally. Sonnet handles most tasks.

3. Hit Complex Problem
/model opus
Switch to Opus for deep architectural decisions.

4. Complete Complex Task
/model sonnet
/compact
Switch back to Sonnet and compact to clear context.

5. New Unrelated Task
/clear
Free instant reset between unrelated tasks.

Agent Teams Warning

Agent Teams run multiple context windows: each teammate consumes tokens independently.
Only use Agent Teams when parallelism provides clear value:
  • Multi-module work
  • Parallel reviews (security + code quality)
For sequential tasks, use subagents instead:
  • /plan → planner agent (single context)
  • /code-review → code-reviewer agent (single context)

Cost Monitoring

Check Current Usage

/cost
Shows token consumption for current session.

Track Over Time

Monitor your Claude Code dashboard:
  • Daily usage trends
  • Per-project costs
  • Model distribution (Opus vs Sonnet)
Target: 80%+ Sonnet usage, <20% Opus usage for optimal cost/performance.
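To see what the 80/20 target implies, the blended cost can be estimated from the "~60% less than Opus" figure quoted earlier. This is rough arithmetic only, assuming every task consumes the same number of tokens regardless of model; the relative prices are taken from this guide, not from a price list.

```python
OPUS_RELATIVE_COST = 1.0
SONNET_RELATIVE_COST = 0.4  # "~60% less than Opus", per the figure in this guide


def blended_cost(sonnet_share: float) -> float:
    """Cost relative to an all-Opus baseline, assuming equal token volume per task."""
    opus_share = 1 - sonnet_share
    return sonnet_share * SONNET_RELATIVE_COST + opus_share * OPUS_RELATIVE_COST


cost = blended_cost(0.8)
print(f"80/20 split costs {cost:.0%} of an all-Opus baseline")  # prints 52%
```

Under these assumptions, model choice alone roughly halves spend.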

Strategic Compaction Skill

Everything Claude Code (ECC) includes a strategic-compact skill that suggests /compact at logical breakpoints. See skills/strategic-compact/SKILL.md for the full decision guide.

Compaction Decision Tree

Completed research/exploration?
  → YES: /compact (clear research context)
  → NO: Continue

Milestone complete (feature done, tests pass)?
  → YES: /compact (clear before next feature)
  → NO: Continue

Debugging complete?
  → YES: /compact (clear investigation context)
  → NO: Continue

Failed approach, trying new direction?
  → YES: /compact (clear failed attempt)
  → NO: Continue

Mid-implementation?
  → NO COMPACTION (preserve working context)
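The tree above can be written as a single function. The sketch below is one possible encoding, not the strategic-compact skill itself: SessionState and compaction_advice are hypothetical names, and checking the mid-implementation case first is an assumption made so that an in-progress change is never compacted away.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical flags describing where a session currently stands."""
    mid_implementation: bool = False
    research_done: bool = False
    milestone_done: bool = False
    debugging_done: bool = False
    approach_failed: bool = False


def compaction_advice(state: SessionState) -> str:
    """Walk the decision tree and return the suggested action."""
    # Assumption: preserving working context takes priority over every other branch.
    if state.mid_implementation:
        return "no compaction (preserve working context)"
    if state.research_done:
        return "/compact (clear research context)"
    if state.milestone_done:
        return "/compact (clear before next feature)"
    if state.debugging_done:
        return "/compact (clear investigation context)"
    if state.approach_failed:
        return "/compact (clear failed attempt)"
    return "continue"


print(compaction_advice(SessionState(research_done=True)))
print(compaction_advice(SessionState(research_done=True, mid_implementation=True)))
```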

Summary: Optimal Settings

{
  "model": "sonnet",
  "env": {
    "MAX_THINKING_TOKENS": "10000",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
    "CLAUDE_CODE_SUBAGENT_MODEL": "haiku"
  }
}

Cost Reduction Checklist

  • Default model set to sonnet
  • MAX_THINKING_TOKENS reduced to 10000
  • Auto-compact threshold at 50%
  • Subagent model set to haiku
  • Unused MCPs disabled per project
  • Total MCPs under 10
  • Total tools under 80
  • Using /clear between unrelated tasks
  • Using /compact at logical breakpoints
  • Using /cost to monitor spending
  • Opus usage <20% of total
Expected Savings: 60-70% cost reduction with these optimizations applied.
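The first four checklist items can be verified against a settings file automatically. The sketch below is a minimal audit, assuming the settings have already been loaded into a dict (for example with json.load); audit_settings is a hypothetical helper, and the remaining checklist items are workflow habits it does not cover.

```python
# Expected values taken from the Summary: Optimal Settings block.
EXPECTED = {
    ("model",): "sonnet",
    ("env", "MAX_THINKING_TOKENS"): "10000",
    ("env", "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE"): "50",
    ("env", "CLAUDE_CODE_SUBAGENT_MODEL"): "haiku",
}


def audit_settings(settings: dict) -> list[str]:
    """Return one finding per setting that differs from the recommended value."""
    findings = []
    for path, expected in EXPECTED.items():
        node = settings
        for key in path:
            # Walk nested keys; a missing level yields None.
            node = node.get(key) if isinstance(node, dict) else None
        if node != expected:
            findings.append(f"{'.'.join(path)}: expected {expected!r}, found {node!r}")
    return findings


sample = {"model": "opus", "env": {"MAX_THINKING_TOKENS": "10000"}}
for finding in audit_settings(sample):
    print(finding)
```

For the sample dict this reports the model and the two missing environment variables; an empty result means the four settings match the recommendations.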