Context Engineering for AI-Assisted Development
In large codebases, AI spends more tokens searching than building.
Plan Stack turns codebase knowledge into reusable plans —
so every task starts with context, not exploration.
A zero-infrastructure cache for LLM codebase exploration.
No install required — just a docs/ folder and one line in CLAUDE.md.
Proven with a 10-engineer team on a 500-table Rails app
The Problem
The real bottleneck: search cost
In a 500-table application, every AI task begins with exploration — ls, grep, file reading — burning thousands of tokens before a single line of code is written. Each search result pollutes the context window, leaving less room for actual implementation.
A context window is not storage.
It is cognitive load.
Stuffing 195K tokens into a 200K window leaves no room for reasoning.
The Solution
Three principles of context engineering
Stop treating context as infinite storage. Start engineering it like you engineer code.
Isolation
Provide the minimum effective context for each task. Scope by responsibility, not file size.
OAuth2 models + relevant controller
Chaining
Break work into stages. Pass artifacts between them, not entire conversation histories.
not conversation history (30K tokens)
Headroom
Never operate at 100% capacity. Reserve space for the model to actually think.
Leave room for reasoning
The Insight
Codebases follow a Pareto distribution
Commit history reveals a power law: a small fraction of files account for the majority of changes. You don't need to search everything every time — distill knowledge of frequently-touched areas into plans, and search cost drops dramatically.
500 files scanned → ~15,000 tokens
1 plan read → ~300 tokens
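The arithmetic behind that comparison can be sketched as follows; the per-file and per-plan token counts are illustrative assumptions, not measurements:

```python
# Illustrative token math: blind scanning vs. reading one distilled plan.
# All numbers are assumptions for the sketch.

FILES_SCANNED = 500
TOKENS_PER_FILE = 30      # rough average cost per scanned file
TOKENS_PER_PLAN = 300     # one distilled plan

scan_cost = FILES_SCANNED * TOKENS_PER_FILE   # 15,000 tokens
plan_cost = TOKENS_PER_PLAN                   # 300 tokens

savings = 1 - plan_cost / scan_cost
print(f"scan: {scan_cost} tokens, plan: {plan_cost} tokens")
print(f"reduction: {savings:.0%}")
```

Under these assumptions, the plan lookup is a ~98% reduction in search cost.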
Introducing
Plan Stack
Implementation plans as first-class artifacts
Instead of letting research and decisions disappear with each /clear, Plan Stack captures them in lightweight, reusable plans.
A 50-file investigation becomes a 300-line plan. Six months later, reviewing one plan beats re-reading 50 files and re-discovering architectural intent.
- Compressed research for AI context
- Long-term memory for humans
- A reliable starting point after context reset
- No infrastructure needed — Git-versioned, readable by both AI and humans
LLM-Readable Guides
Plans become implementation guides that Claude Code reads and follows. No need to re-explain your architecture every session.
not: AI greps 500 files blindly
Knowledge Inheritance
Similar modifications? The AI finds past plans with design decisions already documented. No human explanation needed.
not: rediscover patterns each time
Why docs/, not RAG?
Structured markdown is inherently searchable. Git-versioned naturally. No vector database, no embedding pipeline. Both AI and humans read the same source of truth.
not: RAG pipeline → embeddings → retrieval → overhead
The Workflow
Research, Plan, Execute, Review
Each phase applies context engineering principles. The workflow creates a self-reinforcing loop where knowledge compounds.
Research
AI checks docs/plans/ for similar implementations. Plan exists? Target files identified in hundreds of tokens. No plan? Agentic search runs — results get committed as a new plan.
Plan
AI generates structured plan, human reviews before code. Catches intent misalignment before tokens are spent on wrong implementation.
Headroom
Execute
Plan carries intent across context resets. /clear and resume — no re-explanation needed, no context re-discovery.
Review
Plan in PR enables both AI and human review. Detect drift between intended design and actual implementation.
Isolation + Chaining
The Pattern
Embrace the reset
Context degradation is inevitable. Plan Stack turns /clear from a loss into a feature.
Restart at 0% context without restarting your work.
Get Started
One line to begin
Add this instruction to your CLAUDE.md:
Search docs/plans/ for similar past implementations before planning.
This single line creates the self-reinforcing loop:
1. AI checks docs/plans/ first (hundreds of tokens)
2. Plan found → target files identified immediately
3. No plan → agentic search runs on codebase
4. Search results committed as new plan → knowledge base grows
Who Benefits
Built for every role in your team
Plan Stack isn't just an AI optimization. It improves workflows for everyone involved in software development.
AI-Friendly
Reduced search tokens, prevention of context pollution, explicit intent. AI spends tokens implementing, not exploring.
not: 15,000 tokens to grep codebase
Reviewer-Friendly
Plan in every PR. Reviewer — human or AI — knows the intended design before reading code.
not: guess intent from code changes
Team-Friendly
Knowledge accumulates and inherits. New members read plans to understand past decisions. Onboarding accelerates.
not: ask 10 people → different answers
Learn
Master context engineering
Free courses from beginner to advanced. Each lesson is 1-2 minutes, designed for engineers who value their time.
Stop searching the same codebase every time
Start accumulating knowledge. Plans compound with every commit — from your first plan to your thousandth.