Claude Code Context Management

Q: What is /compact in Claude Code?

/compact compresses the current conversation history into a summary, freeing up context window space without losing the thread of the current task. Use it when Claude warns you the context is filling up and you want to continue the same task. /compact preserves what Claude knows about the current task but loses the detailed back-and-forth history. It is the right choice mid-task when you're not done yet.

Q: What is the difference between /compact and /clear in Claude Code?

/compact summarizes the conversation history into a compressed form, preserving task continuity while freeing context space. /clear resets the conversation completely, starting fresh with an empty context. Use /compact when you're mid-task and want to continue. Use /clear when you've finished a task and are starting something new. /clear is cheaper because the next turn carries no history; /compact preserves continuity at the cost of generating and carrying a summary.

Q: How large is Claude Code's context window?

Claude Code uses Claude's full 200K token context window (approximately 150,000 words or 500 pages of text). In practice, a typical Claude Code session reading a medium-sized codebase (30-50 files) with back-and-forth conversation uses 50K-100K tokens before hitting the compaction threshold. Very large codebases or long sessions may require /compact or splitting across multiple sessions.

Q: How do I control Claude Code token costs?

The main levers for controlling Claude Code token costs: (1) Use /clear between tasks, since a cleared conversation carries no history into the next turn. (2) Be specific about which files to read: 'read only src/auth/' instead of 'read the codebase'. (3) Use /compact instead of continuing a long conversation, because a summary is cheaper to resend than the full history. (4) Use CLAUDE.md to give Claude standing context so it doesn't re-read architecture files every session. (5) Use the Haiku model for simple tasks (renames, formatting) and Sonnet/Opus for complex reasoning.

Q: How do I prime Claude Code with context for a new session?

For a new session on a codebase Claude hasn't seen: (1) Start with a CLAUDE.md that describes the architecture and dev commands — Claude reads it automatically. (2) Tell Claude the specific area you're working in: 'We're working on the authentication service today — read src/auth/ and src/middleware/auth.ts.' (3) Give Claude the current state: 'The last session added JWT refresh tokens — the new code is in AuthService.refreshToken(). Today we're adding rate limiting.' Targeted context priming is faster and cheaper than asking Claude to read everything.

Updated May 2026 · /compact, /clear, and token efficiency

Context management is the skill that separates effective Claude Code users from frustrated ones. Claude Code has a 200K token context window — large, but not infinite. Long sessions, large codebases, and repeated file reads fill it up. Knowing when to compact, when to clear, and how to prime context efficiently makes every session faster and cheaper.

/compact vs /clear — When to Use Each

Command	What it does	When to use
`/compact`	Summarizes conversation history, frees context space while preserving task continuity	Mid-task when context is filling up and you're not done yet
`/clear`	Resets the conversation completely — empty context, fresh start	Between tasks, or when starting something unrelated to the current session

The key distinction: /compact preserves the thread of what you're working on. /clear throws it away. Use /compact when you need continuity; use /clear when a new task is genuinely unrelated.

/clear is cheaper than /compact. After /clear, the next turn carries no conversation history, only standing context such as CLAUDE.md. /compact generates a summary, which costs tokens on the compaction turn and on every subsequent turn, since the summary is included in each later prompt. For cost-sensitive workflows, prefer /clear at task boundaries over letting the context fill and compacting.
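The cost difference comes down to how much history is resent on each turn. Here is a minimal sketch of that arithmetic, assuming every turn adds a fixed number of tokens and the whole context is sent as input each time. It ignores prompt caching, output tokens, and the cost of generating the summary itself, and every number in it is illustrative rather than measured.

```python
def cumulative_input_tokens(turns, tokens_per_turn, base_tokens=0,
                            compact_at=None, summary_tokens=0):
    """Total input tokens sent over `turns` turns.

    Simplifying assumption: every turn resends the whole context, i.e.
    base_tokens (standing context such as CLAUDE.md) plus all history so far.
    If compact_at is set, the history is replaced by a summary of
    summary_tokens once that turn has completed.
    """
    total = 0
    history = 0
    for turn in range(1, turns + 1):
        history += tokens_per_turn        # the new exchange joins the history
        total += base_tokens + history    # the full context is sent as input
        if compact_at is not None and turn == compact_at:
            history = summary_tokens      # the summary replaces the history
    return total


# Illustrative numbers only: 20 turns, 1,000 tokens added per turn.
no_management = cumulative_input_tokens(20, 1_000)
with_compact = cumulative_input_tokens(20, 1_000, compact_at=10, summary_tokens=500)
# Clearing after turn 10 behaves like two independent 10-turn sessions.
with_clear = 2 * cumulative_input_tokens(10, 1_000)

print(no_management, with_compact, with_clear)
```

Under these assumptions, clearing at the task boundary sends the fewest tokens, compacting comes close behind, and doing nothing costs roughly twice as much as either.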

Understanding the 200K Context Window

Claude Code's full 200K token window is roughly equivalent to:

~500 pages of text, or ~150,000 words
A medium-sized Node.js or Python app (30–50 files) read in full
2–4 hours of typical back-and-forth conversation with moderate file reading

Context fills from two sources: the files Claude reads and the growing conversation history. Both accumulate in the same window, and file contents stay there after they are read. When Claude warns that the context is filling up, you're typically at 70–80% capacity.
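To get a feel for how quickly files and conversation eat into the window, you can estimate usage from character counts. This is a rough sketch, not a real tokenizer: the 4-characters-per-token ratio is a common rule of thumb and an assumption here, and the function names are made up for illustration.

```python
CONTEXT_WINDOW_TOKENS = 200_000  # window size stated in this guide
CHARS_PER_TOKEN = 4              # rule-of-thumb assumption, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def window_fraction(texts) -> float:
    """Fraction of the context window the given texts would occupy."""
    used = sum(estimate_tokens(t) for t in texts)
    return used / CONTEXT_WINDOW_TOKENS


# Example: one 40,000-character file plus 8,000 characters of conversation.
fraction = window_fraction(["x" * 40_000, "y" * 8_000])
print(f"{fraction:.0%} of the window")  # prints "6% of the window"
```

Real token counts vary with language and content, so treat the result as an order-of-magnitude check.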

Priming Context for a New Session

Starting a new session efficiently means giving Claude the minimum context it needs to work, not everything it could possibly want. Over-specification wastes tokens; under-specification causes Claude to ask clarifying questions or make wrong assumptions.

The right level of context priming

Efficient session start

We're working on the authentication service today. Read src/auth/ and src/middleware/auth.ts. Context: this is a Next.js + Prisma app. The last session added JWT refresh tokens — the new code is in AuthService.refreshToken(). Today we're adding rate limiting to the login endpoint.

This prompt has Claude read one directory and one file instead of the whole codebase, supplies the "last session" summary it would otherwise have gotten from conversation history, and scopes the task. It's faster and cheaper, and it gives Claude exactly what it needs.
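If you start sessions this way often, the prompt above can be treated as a template with five slots. The helper below is a hypothetical sketch (the function and parameter names are not part of Claude Code); it only assembles text for you to paste.

```python
def build_priming_prompt(area, paths, stack, last_session, task):
    """Compose a scoped session-start prompt from its parts."""
    return (
        f"We're working on {area} today. "
        f"Read {' and '.join(paths)}. "
        f"Context: this is a {stack} app. "
        f"The last session {last_session}. "
        f"Today we're {task}."
    )


prompt = build_priming_prompt(
    area="the authentication service",
    paths=["src/auth/", "src/middleware/auth.ts"],
    stack="Next.js + Prisma",
    last_session="added JWT refresh tokens",
    task="adding rate limiting to the login endpoint",
)
print(prompt)
```

Keeping the slots explicit makes it harder to forget the "last session" line, which is the part that replaces the lost conversation history.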

What NOT to prime

Inefficient session start (avoid)

Read the entire codebase and tell me about the architecture.

This reads everything (expensive), produces a summary you'll have to scroll through (time-consuming), and doesn't scope Claude toward anything useful. Use your CLAUDE.md for architecture context and reserve session priming for the specific area you're working in today.

CLAUDE.md as Persistent Context

The highest-leverage context management tool is a well-written CLAUDE.md. Claude reads it automatically at the start of every session, which means you don't need to re-explain architecture, conventions, or dev commands in every session prompt.

Things that belong in CLAUDE.md (read once, applied every session):

Project overview and tech stack
Dev commands (make dev, make test, etc.)
Key architectural decisions and constraints
Coding conventions and forbidden patterns

Things that do NOT belong in CLAUDE.md (put in session priming instead):

Current task or sprint goals
Recent changes from the last session
Which specific files to read today

Managing Context in Long Sessions

Signs you're approaching the context limit

Claude warns you the context is filling up
Claude starts "forgetting" earlier decisions in the conversation
Responses become slower (larger contexts take longer to process)
Claude asks clarifying questions about things you established earlier

What to do

Use /compact if you're mid-task and want to continue
Summarize manually before /clear — write a one-paragraph summary of where you left off, then /clear and paste the summary at the start of the new session
Split the task — finish what you're on, commit, then start a fresh session for the next piece

Token Cost Levers

Every token Claude reads and writes costs money. The main dials:

Action	Token impact	Best practice
Reading a 500-line file	Roughly 5,000 tokens per read at ~10 tokens per line (varies with language and line length); stays in context	Read only files relevant to the task
Long conversation history	Included in every subsequent turn	Use /clear at task boundaries
/compact	Replaces history with a much shorter summary (length varies by session)	Use mid-task when context is filling
/clear	Conversation history → 0 (CLAUDE.md still loaded)	Use between unrelated tasks
CLAUDE.md	Loaded at session start and kept in context (~500–2,000 tokens)	Keep concise; remove stale content
Model choice	Haiku costs a fraction of Opus per token (exact ratio depends on model versions and current pricing)	Use Haiku for simple tasks (renames, formatting)

Estimate your session cost. Use the Claude Cost Calculator to estimate token costs before starting large sessions. As a rough guide, a typical 2-hour Claude Code session with moderate file reading (50 files, 100 turns) costs $0.15–$0.60 with Claude Sonnet, and long sessions reading large codebases can reach $1–$3 without context management. Actual figures depend on current pricing and on how much of the context is served from the prompt cache.
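The underlying arithmetic is simple enough to sketch yourself. The prices below are placeholders chosen for illustration, not quoted rates, and the calculation ignores prompt-caching discounts; substitute current per-million-token prices for the model you use.

```python
def session_cost_usd(input_tokens, output_tokens,
                     input_price_per_mtok, output_price_per_mtok):
    """Cost in USD given token totals and per-million-token prices."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000


# Placeholder prices (assumptions): check current pricing before relying on them.
EXAMPLE_INPUT_PRICE = 3.00    # USD per million input tokens
EXAMPLE_OUTPUT_PRICE = 15.00  # USD per million output tokens

cost = session_cost_usd(
    input_tokens=100_000,
    output_tokens=10_000,
    input_price_per_mtok=EXAMPLE_INPUT_PRICE,
    output_price_per_mtok=EXAMPLE_OUTPUT_PRICE,
)
print(f"${cost:.2f}")  # prints "$0.45" with the placeholder prices above
```

Note that input_tokens here means the total sent across all turns, which grows much faster than the size of the context itself when history is resent every turn.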

Context Management for Large Codebases

For monorepos or large codebases where reading everything is impractical:

Scoped reading

Prompt

We're adding a new API endpoint today. Read only: src/routes/ (route definitions), src/middleware/ (auth/validation middleware), and src/services/UserService.ts (the service we'll call). Don't read anything else unless you need it.
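Before sending a scoped prompt like this, it can help to check how much smaller the scope really is than the whole repository. The sketch below uses file size in bytes as a stand-in for character count and the same rough 4-characters-per-token assumption; the function name is hypothetical, and it does no filtering, so an unscoped call also counts directories like node_modules or .git if they exist.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rule-of-thumb assumption, not a real tokenizer


def estimate_dir_tokens(root, scopes=None):
    """Estimate tokens for files under root, optionally limited to scopes.

    scopes is a list of paths relative to root (directories or files).
    Paths that do not exist contribute nothing.
    """
    root = Path(root)
    targets = [root] if scopes is None else [root / s for s in scopes]
    total_bytes = 0
    for target in targets:
        if target.is_file():
            total_bytes += target.stat().st_size
        elif target.is_dir():
            total_bytes += sum(
                p.stat().st_size for p in target.rglob("*") if p.is_file()
            )
    return total_bytes // CHARS_PER_TOKEN
```

For the prompt above you might compare estimate_dir_tokens(".", ["src/routes", "src/middleware", "src/services/UserService.ts"]) against estimate_dir_tokens("src") to see what the scoping saves.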

Progressive reading

Ask Claude to read files as needed, not upfront: "Start with the route handler. If you need to read other files to understand it, read them as you encounter them." Claude will pull files lazily rather than loading everything at once.

Session handoff

For multi-session work, end each session with a handoff summary:

End-of-session handoff prompt

Before I close this session: write a one-paragraph summary of what we did, what you changed, what's left to do, and any decisions we made that future-me needs to know. I'll paste this at the start of the next session.
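If you want the handoff to survive outside your clipboard, one option is to save it to a dated file and rebuild the next session's opening prompt from it. This is a hypothetical helper, not a Claude Code feature; the handoff-YYYY-MM-DD.md filename and the prompt wording are assumptions.

```python
from datetime import date
from pathlib import Path


def save_handoff(summary: str, directory=".", day=None) -> Path:
    """Write the handoff summary to a dated Markdown file and return its path."""
    day = day or date.today()
    path = Path(directory) / f"handoff-{day.isoformat()}.md"
    path.write_text(summary.strip() + "\n", encoding="utf-8")
    return path


def next_session_prompt(handoff_path, task: str) -> str:
    """Build the opening prompt for the next session from a saved handoff."""
    summary = Path(handoff_path).read_text(encoding="utf-8").strip()
    return f"Summary of the last session:\n{summary}\n\nToday's task: {task}"
```

Paste Claude's end-of-session paragraph into save_handoff, then start the next session with the text from next_session_prompt plus the files you want read.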
