AI-Native Workflow

Context Engineering for AI-Assisted Development

In large codebases, AI spends more tokens searching than building.
Plan Stack turns codebase knowledge into reusable plans — so every task starts with context, not exploration.

A zero-infrastructure cache for LLM codebase exploration.
No install required — just a docs/ folder and one line in CLAUDE.md.

Proven with a 10-engineer team on a 500-table Rails app

The real bottleneck: search cost

In a 500-table application, every AI task begins with exploration — ls, grep, file reading — burning thousands of tokens before a single line of code is written. That search pollutes the context window, leaving less room for actual implementation.

62x
Context Growth
From 32K to 2M tokens in two years. Yet quality plateaued.
~70%
Search Overhead
In large apps, agentic exploration dominates token usage before any code is generated.
Lost
In the Middle
Information in the middle of long contexts gets systematically ignored.

A context window is not storage. It is cognitive load.
Stuffing 195K tokens into a 200K window leaves no room for reasoning.

Three principles of context engineering

Stop treating context as infinite storage. Start engineering it like you engineer code.

01

Isolation

Provide the minimum effective context for each task. Scope by responsibility, not file size.

OAuth2 models + relevant controller
not: OAuth2 + billing + CSS + tests
02

Chaining

Break work into stages. Pass artifacts between them, not entire conversation histories.

Plan artifact (300 lines)
not: conversation history (30K tokens)
03

Headroom

Never operate at 100% capacity. Reserve space for the model to actually think.

Token limit = input + output
Leave room for reasoning
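As a rough illustration, the headroom principle can be reduced to a budget check before each turn. This is a sketch, not part of Plan Stack: the 200K window and the 25% reserve are illustrative numbers, and `should_clear` is a hypothetical helper name.

```python
def should_clear(input_tokens: int, window: int = 200_000,
                 reserve_ratio: float = 0.25) -> bool:
    """Signal a /clear when less than `reserve_ratio` of the window
    remains for the model's output and reasoning."""
    headroom = window - input_tokens
    return headroom < window * reserve_ratio

# 195K tokens in a 200K window leaves only 5K of headroom: time to reset.
```

The exact threshold matters less than having one at all — any fixed reserve prevents the "stuffed to 100%" failure mode described above.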

Codebases follow a Pareto distribution

Commit history reveals a power law: a small fraction of files account for the majority of changes. You don't need to search everything every time — distill knowledge of frequently-touched areas into plans, and search cost drops dramatically.

! Without Plan Stack
Every task → full codebase search
500 files scanned → ~15,000 tokens
+ With Plan Stack
Every task → check plans first
1 plan read → ~300 tokens
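You can check the power-law claim on your own repository by feeding the file list from `git log --name-only` into a concentration measure. A minimal sketch (the function name and the 20% cutoff are illustrative assumptions):

```python
from collections import Counter

def change_concentration(changed_files: list[str],
                         top_fraction: float = 0.2) -> float:
    """Fraction of all changes accounted for by the most-touched
    `top_fraction` of files. `changed_files` is one entry per change,
    e.g. the paths printed by `git log --name-only`."""
    counts = Counter(changed_files)
    ranked = sorted(counts.values(), reverse=True)
    top_n = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)
```

A Pareto-shaped codebase returns something near 0.8 — meaning plans for that top slice of files cover most future tasks.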

Plan Stack

Implementation plans as first-class artifacts

Instead of letting research and decisions disappear with each /clear, Plan Stack captures them in lightweight, reusable plans.

A 50-file investigation becomes a 300-line plan. Six months later, reviewing one plan beats re-reading 50 files and re-discovering architectural intent.

  • + Compressed research for AI context
  • + Long-term memory for humans
  • + A reliable starting point after context reset
  • + No infrastructure needed — Git-versioned, readable by both AI and humans

LLM-Readable Guides

Plans become implementation guides that Claude Code reads and follows. No need to re-explain your architecture every session.

AI reads plan → understands intent
not: AI greps 500 files blindly

Knowledge Inheritance

Similar modifications? The AI finds past plans with design decisions already documented. No human explanation needed.

"Add OAuth" → finds OAuth plan
not: rediscover patterns each time

Why docs/, not RAG?

Structured markdown is inherently searchable. Git-versioned naturally. No vector database, no embedding pipeline. Both AI and humans read the same source of truth.

docs/plans/ → versioned, reviewable, zero infra
not: RAG pipeline → embeddings → retrieval → overhead

Research, Plan, Execute, Review

Each phase applies context engineering principles. The workflow creates a self-reinforcing loop where knowledge compounds.

Research

AI checks docs/plans/ for similar implementations. Plan exists? Target files identified in hundreds of tokens. No plan? Agentic search runs — results get committed as a new plan.

Isolation

Plan

The AI generates a structured plan and a human reviews it before any code is written. This catches intent misalignment before tokens are spent on the wrong implementation.

Headroom

Execute

Plan carries intent across context resets. /clear and resume — no re-explanation needed, no context re-discovery.

Chaining

Review

The plan in the PR enables both AI and human review. Reviewers can detect drift between the intended design and the actual implementation.

Isolation + Chaining
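The Research phase above — "check docs/plans/ before searching the codebase" — can be sketched as a cheap keyword match over plan files. This is a hypothetical illustration, not Plan Stack's implementation; `find_candidate_plans` and the scoring rule are assumptions.

```python
from pathlib import Path

def find_candidate_plans(task: str, plans_dir: str = "docs/plans") -> list[Path]:
    """Rank existing plans by how many task keywords appear in their
    filename or opening lines. Hundreds of tokens, not a full grep."""
    keywords = {w.lower() for w in task.split() if len(w) > 3}
    scored = []
    for plan in Path(plans_dir).glob("*.md"):
        text = (plan.stem + " " + plan.read_text(errors="ignore")[:500]).lower()
        score = sum(1 for k in keywords if k in text)
        if score:
            scored.append((score, plan))
    return [p for _, p in sorted(scored, key=lambda s: -s[0])]
```

If the list comes back empty, the workflow falls through to agentic search — whose results then become the next plan.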

Embrace the reset

Context degradation is inevitable. Plan Stack turns /clear from a loss into a feature.

! Before /clear
95% context used, quality degrading
+ Resume from plan
15% context, full fidelity restored

Restart at 0% context without restarting your work.

One line to begin

Add this instruction to your CLAUDE.md:

CLAUDE.md
Search docs/plans/ for similar past implementations before planning.

This single line creates the self-reinforcing loop:

  1. AI checks docs/plans/ first (hundreds of tokens)
  2. Plan found → target files identified immediately
  3. No plan → agentic search runs on the codebase
  4. Search results committed as a new plan → the knowledge base grows
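Step 4 — persisting search results as a plan — is just writing a small markdown file. A minimal sketch under assumed conventions (the `save_plan` helper, the slug naming, and the section headings are illustrative, not a prescribed format):

```python
from pathlib import Path

def save_plan(title: str, target_files: list[str], notes: str,
              plans_dir: str = "docs/plans") -> Path:
    """Persist agentic-search findings as a plan so the next similar
    task starts from a ~300-token read instead of a codebase grep."""
    slug = title.lower().replace(" ", "-")
    body = "\n".join([
        f"# {title}",
        "",
        "## Target files",
        *[f"- {f}" for f in target_files],
        "",
        "## Notes",
        notes,
        "",
    ])
    path = Path(plans_dir) / f"{slug}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body)
    return path
```

Because the output is plain markdown in Git, the plan is immediately reviewable in the same PR as the code it describes.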

Built for every role in your team

Plan Stack isn't just an AI optimization. It improves workflows for everyone involved in software development.


AI-Friendly

Reduced search tokens, prevention of context pollution, explicit intent. AI spends tokens implementing, not exploring.

300 tokens to load plan
not: 15,000 tokens to grep codebase

Reviewer-Friendly

Plan in every PR. Reviewer — human or AI — knows the intended design before reading code.

Plan + diff = clear intent
not: guess intent from code changes

Team-Friendly

Knowledge accumulates and is inherited. New members read plans to understand past decisions. Onboarding accelerates.

Read 10 plans → understand the system
not: ask 10 people → different answers

Stop searching the same codebase every time

Start accumulating knowledge. Plans compound with every commit — from your first plan to your thousandth.