Claude Code Context Management
Context management is the skill that separates effective Claude Code users from frustrated ones. Claude Code has a 200K token context window — large, but not infinite. Long sessions, large codebases, and repeated file reads fill it up. Knowing when to compact, when to clear, and how to prime context efficiently makes every session faster and cheaper.
/compact vs /clear — When to Use Each
| Command | What it does | When to use |
|---|---|---|
| /compact | Summarizes conversation history, frees context space while preserving task continuity | Mid-task when context is filling up and you're not done yet |
| /clear | Resets the conversation completely: empty context, fresh start | Between tasks, or when starting something unrelated to the current session |
The key distinction: /compact preserves the thread of what you're working on. /clear throws it away. Use /compact when you need continuity; use /clear when a new task is genuinely unrelated.
Understanding the 200K Context Window
Claude Code's full 200K token window is roughly equivalent to:
- ~500 pages of text, or ~150,000 words
- A medium-sized Node.js or Python app (30–50 files) read in full
- 2–4 hours of typical back-and-forth conversation with moderate file reading
Context fills from two sources at once: the files Claude reads and the growing conversation history. Both occupy the same window, and both stay there until you compact or clear. When Claude warns that the context is filling up, you're typically at 70–80% capacity.
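To make the arithmetic concrete, here is a minimal sketch that turns the figures above into a budget check. The only input it relies on is the ratio implied by this section (about 150,000 words per 200K tokens); the function names are hypothetical, and a real tokenizer will give different counts, especially for code.

```python
# Rough context-budget arithmetic using this section's figures.
CONTEXT_WINDOW_TOKENS = 200_000
WORDS_PER_TOKEN = 150_000 / 200_000  # 0.75, derived from the figures above


def estimate_tokens(word_count: int) -> int:
    """Estimate tokens from a word count using the ratio above."""
    return round(word_count / WORDS_PER_TOKEN)


def window_fraction(tokens_used: int) -> float:
    """Fraction of the 200K window that tokens_used occupies."""
    return tokens_used / CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # Hypothetical session: 30,000 words of conversation plus 75,000 words of files.
    used = estimate_tokens(30_000) + estimate_tokens(75_000)
    print(f"{used} tokens, {window_fraction(used):.0%} of the window")
```

Under these assumptions, a session in that range sits right around the point where the warning typically appears.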
Priming Context for a New Session
Starting a new session efficiently means giving Claude the minimum context it needs to work, not everything it could possibly want. Over-specification wastes tokens; under-specification causes Claude to ask clarifying questions or make wrong assumptions.
The right level of context priming
We're working on the authentication service today. Read src/auth/ and src/middleware/auth.ts. Context: this is a Next.js + Prisma app. The last session added JWT refresh tokens — the new code is in AuthService.refreshToken(). Today we're adding rate limiting to the login endpoint.
This prompt reads a handful of files instead of the whole codebase, gives Claude the "last session" summary it would otherwise have gotten from conversation history, and scopes the task. It's faster, cheaper, and gives Claude exactly what it needs.
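If you write prompts like this often, the structure can be captured as a small template. The sketch below is one possible way to do it; `SessionBrief` and `build_priming_prompt` are hypothetical names, not part of Claude Code, and the output is just text you paste into a session.

```python
from dataclasses import dataclass


@dataclass
class SessionBrief:
    """Hypothetical container for the pieces of a priming prompt."""
    area: str          # what part of the project today's work touches
    paths: list[str]   # files or directories Claude should read
    stack: str         # one-line tech stack reminder
    last_session: str  # what the previous session did
    todays_task: str   # the scoped goal for this session


def build_priming_prompt(brief: SessionBrief) -> str:
    """Assemble a scoped priming prompt: area, files, stack, history, task."""
    return (
        f"We're working on {brief.area} today. "
        f"Read {', '.join(brief.paths)}. "
        f"Context: this is a {brief.stack} app. "
        f"The last session {brief.last_session}. "
        f"Today we're {brief.todays_task}."
    )


if __name__ == "__main__":
    brief = SessionBrief(
        area="the authentication service",
        paths=["src/auth/", "src/middleware/auth.ts"],
        stack="Next.js + Prisma",
        last_session="added JWT refresh tokens in AuthService.refreshToken()",
        todays_task="adding rate limiting to the login endpoint",
    )
    print(build_priming_prompt(brief))
```

The fields mirror the three things the prompt above supplies: which files to read, what happened last time, and what today's task is.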
What NOT to prime
Read the entire codebase and tell me about the architecture.
This reads everything (expensive), produces a summary you'll have to scroll through (time-consuming), and doesn't scope Claude toward anything useful. Use your CLAUDE.md for architecture context and reserve session priming for the specific area you're working in today.
CLAUDE.md as Persistent Context
The highest-leverage context management tool is a well-written CLAUDE.md. Claude reads it automatically at the start of every session, which means you don't need to re-explain architecture, conventions, or dev commands in every session prompt.
Things that belong in CLAUDE.md (read once, applied every session):
- Project overview and tech stack
- Dev commands (make dev, make test, etc.)
- Key architectural decisions and constraints
- Coding conventions and forbidden patterns
Things that do NOT belong in CLAUDE.md (put in session priming instead):
- Current task or sprint goals
- Recent changes from the last session
- Which specific files to read today
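Because CLAUDE.md is loaded every session, it is worth checking its size now and then. The sketch below is a rough check, assuming the word-to-token ratio implied earlier in this article (about 0.75 words per token) and using the upper end of the ~500–2000 token range this article's cost table gives for CLAUDE.md as a budget. The function names are hypothetical and the estimate is not a real tokenizer count.

```python
from pathlib import Path

WORDS_PER_TOKEN = 0.75  # assumption: ~150,000 words per 200K tokens
TOKEN_BUDGET = 2000     # assumption: upper end of the ~500-2000 token range


def estimate_tokens(text: str) -> int:
    """Rough token estimate from whitespace-separated words."""
    return round(len(text.split()) / WORDS_PER_TOKEN)


def claude_md_report(text: str, budget: int = TOKEN_BUDGET) -> str:
    """Describe whether a CLAUDE.md body fits within the token budget."""
    tokens = estimate_tokens(text)
    if tokens > budget:
        return f"~{tokens} tokens: over the {budget}-token budget, trim stale content"
    return f"~{tokens} tokens: within the {budget}-token budget"


if __name__ == "__main__":
    path = Path("CLAUDE.md")
    if path.exists():
        print(claude_md_report(path.read_text(encoding="utf-8")))
```

If the report says you are over budget, the session-specific items listed above are usually the first things to move out.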
Managing Context in Long Sessions
Signs you're approaching the context limit
- Claude warns you the context is filling up
- Claude starts "forgetting" earlier decisions in the conversation
- Responses become slower (larger contexts take longer to process)
- Claude asks clarifying questions about things you established earlier
What to do
- Use /compact if you're mid-task and want to continue
- Summarize manually before /clear — write a one-paragraph summary of where you left off, then /clear and paste the summary at the start of the new session
- Split the task — finish what you're on, commit, then start a fresh session for the next piece
Token Cost Levers
Every token Claude reads and writes costs money. The main dials:
| Action | Token impact | Best practice |
|---|---|---|
| Reading a 500-line file | Roughly 5,000 tokens per read (varies with line length), stays in context | Read only files relevant to the task |
| Long conversation history | Included in every subsequent turn | Use /clear at task boundaries |
| /compact | Replaces history with a much shorter summary (size varies with the session) | Use mid-task when context is filling |
| /clear | Context → 0 (CLAUDE.md still read) | Use between unrelated tasks |
| CLAUDE.md | Read once per session (~500–2000 tokens) | Keep concise; remove stale content |
| Model choice | Haiku costs a fraction of what Opus does per token (the exact ratio depends on model versions) | Use Haiku for simple tasks (renames, formatting) |
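The "included in every subsequent turn" row is the one that surprises people, so here is a small sketch of the arithmetic. It is a simplified model: every turn resends a fixed base context plus all prior turns, each turn adds the same number of tokens, and prompt caching and output tokens are ignored. All numbers are hypothetical.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int, base_tokens: int = 0) -> int:
    """Total input tokens across a session where each turn resends
    the base context plus all prior turns."""
    total = 0
    history = 0
    for _ in range(turns):
        total += base_tokens + history  # context sent with this turn
        history += tokens_per_turn      # this turn joins the history
    return total


if __name__ == "__main__":
    # Hypothetical: 1,000 tokens added per turn, 2,000 tokens of base context.
    one_long_session = cumulative_input_tokens(40, 1_000, base_tokens=2_000)
    two_short_sessions = 2 * cumulative_input_tokens(20, 1_000, base_tokens=2_000)
    print(one_long_session, two_short_sessions)
```

In this model, history cost grows with the square of the session length, which is why clearing at a task boundary saves more than it seems like it should.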
Context Management for Large Codebases
For monorepos or large codebases where reading everything is impractical:
Scoped reading
We're adding a new API endpoint today. Read only: src/routes/ (route definitions), src/middleware/ (auth/validation middleware), and src/services/UserService.ts (the service we'll call). Don't read anything else unless you need it.
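Before handing Claude a scope like this, it can help to estimate how much context it will consume. The sketch below walks the listed paths and applies a rough rule of thumb of about 4 characters per token. That figure is an assumption, not a tokenizer count, and the helper names are hypothetical.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb, not a real tokenizer


def iter_files(scope: list[Path]):
    """Yield every file in the scope, descending into directories."""
    for entry in scope:
        if entry.is_dir():
            yield from sorted(p for p in entry.rglob("*") if p.is_file())
        elif entry.is_file():
            yield entry
        # Paths that do not exist are skipped silently.


def estimate_scope_tokens(scope: list[Path]) -> dict[str, int]:
    """Map each file in the scope to a rough token estimate."""
    estimates = {}
    for path in iter_files(scope):
        text = path.read_text(encoding="utf-8", errors="ignore")
        estimates[str(path)] = len(text) // CHARS_PER_TOKEN
    return estimates


if __name__ == "__main__":
    scope = [
        Path("src/routes"),
        Path("src/middleware"),
        Path("src/services/UserService.ts"),
    ]
    estimates = estimate_scope_tokens(scope)
    for name, tokens in estimates.items():
        print(f"{tokens:>8}  {name}")
    print(f"{sum(estimates.values()):>8}  total (estimated)")
```

If the total is a large share of the window, narrow the scope further before the session starts.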
Progressive reading
Ask Claude to read files as needed, not upfront: "Start with the route handler. If you need to read other files to understand it, read them as you encounter them." Claude will pull files lazily rather than loading everything at once.
Session handoff
For multi-session work, end each session with a handoff summary:
Before I close this session: write a one-paragraph summary of what we did, what you changed, what's left to do, and any decisions we made that future-me needs to know. I'll paste this at the start of the next session.
FAQ
What is /compact in Claude Code?
/compact compresses the current conversation history into a summary, freeing up context window space without losing the thread of the current task. Use it when Claude warns the context is filling up and you want to continue the same task. It's the right choice mid-task when you're not done yet.
What is the difference between /compact and /clear in Claude Code?
/compact summarizes conversation history into a compressed form, preserving task continuity while freeing context space. /clear resets the conversation completely, starting fresh with an empty context. Use /compact when you're mid-task; use /clear when you've finished a task and are starting something new.
How large is Claude Code's context window?
Claude Code uses Claude's full 200K token context window — approximately 150,000 words or 500 pages of text. A typical session reading a medium-sized codebase (30–50 files) with back-and-forth conversation uses 50K–100K tokens before hitting the compaction threshold.
How do I control Claude Code token costs?
The main levers: (1) Use /clear between tasks — empty context costs nothing. (2) Be specific about which files to read. (3) Use /compact instead of continuing a very long conversation. (4) Use CLAUDE.md for standing context so you don't re-explain architecture every session. (5) Use Haiku for simple tasks and Sonnet/Opus for complex reasoning.
How do I prime Claude Code with context for a new session?
Start with a CLAUDE.md that describes the architecture (Claude reads it automatically). Then give Claude targeted session context: which files to read, what the last session did, and what today's task is. Targeted priming is faster and cheaper than asking Claude to read everything.