If you've used the same AI model, say Claude Sonnet, in both Claude Code and Cursor, you've probably noticed something strange. The exact same model behaves differently. It's more helpful in one, more confused in the other. It nails your codebase conventions in one tool and completely ignores them in another.
That's not magic, and it's not the model's fault. It's context engineering. Or more precisely, the difference between good context engineering and none at all.
Most developers hear "context" and think about prompt engineering — choosing the right words, phrasing instructions clearly, maybe adding a few examples. That matters, but it's a small piece of a much bigger puzzle. Context engineering is about everything the model sees when it generates a response: your project structure, your documentation, your configuration files, your conversation history, the files your tool retrieves. All of it. And most of it is shaped by how your repo is organized, not by how clever your prompt is.
This post is about the repo side of that equation — the practical, structural changes you can make so that AI coding assistants actually understand your codebase instead of guessing at it.
Let's get the big idea out of the way first.
Prompt engineering is about how you phrase a request. Context engineering is about what the model knows when it receives your request. The first is a sentence. The second is an entire information architecture.
Here's a useful mental model. Imagine you're in a meeting with a brilliant consultant who has no memory outside of a single whiteboard. Whatever's on the board is all they know. They can't remember what you discussed yesterday. They can't look things up. They see what's written, and they reason from that.
That whiteboard is the context window. And your job — whether you realize it or not — is to decide what goes on it.
When you ask an AI coding assistant to refactor a function, the model doesn't "know" your codebase. It knows whatever your tool managed to stuff into the context window in time: maybe the current file, maybe some retrieved files, maybe your CLAUDE.md or AGENTS.md. If those inputs are messy, incomplete, or irrelevant, the output will be too. No amount of prompt cleverness will save you if the model is reasoning from bad context.
This is why two developers using the exact same tool on the exact same model can get wildly different results. One has a repo optimized for AI. The other doesn't. The model is identical — the context is not.
This is the mental shift most people miss.
A context window isn't just a container you fill up. It's a finite budget where every token competes for the model's attention. Research consistently shows a U-shaped attention curve: models attend most to information at the very beginning and very end of the context, and tend to lose focus on what's in the middle. Performance also degrades as the total input length increases — even on models that technically support 128K or 200K tokens.
This means dumping your entire repo into the context isn't a strategy. It's anti-strategy. More context isn't better. Relevant context is better.
Your goal as a developer isn't to maximize what the model sees. It's to maximize the signal-to-noise ratio of what the model sees. Everything you put in the context window should earn its spot.
This has direct implications for how you structure your repo. If your project is well-organized, with clear naming, focused files, and good documentation, your AI tools can retrieve the right context efficiently. If your project is a tangled monolith with vague file names and no documentation, the tool spends its token budget on noise — and the model's output suffers accordingly.
Before you can optimize your repo, it helps to understand what your AI tools are actually doing under the hood. Here's the typical flow when you ask an agent like Claude Code, Cursor, or Copilot to make a change:
System prompt loads. This includes the tool's built-in instructions plus any configuration files you've set up (like CLAUDE.md or .cursorrules). This goes in first and eats into your token budget before you've said a word.
Your request arrives. The model sees your message — plus whatever conversation history is still in the window.
The tool retrieves context. Depending on the tool, this might be semantic search, file tree exploration, grep-style keyword matching, or reading files you explicitly referenced. The tool decides what to pull in.
The model reasons and acts. It generates code, runs commands, reads output — all within the remaining token budget.
Every one of these stages is shaped by your repo. The system prompt reads your configuration files. The retrieval step searches your file structure and documentation. The model's ability to understand what it reads depends on how well your code is organized, named, and documented.
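The four stages above can be sketched as a toy model: a fixed token budget that pieces fill in order, and anything that doesn't fit gets dropped. This is a simplified illustration, not any real tool's API; the function name and token counts are invented.

```javascript
// Toy model of context assembly: pieces fill a fixed token budget
// in order, and whatever doesn't fit is silently dropped.
// (Illustrative only; real tools are more sophisticated.)
function assembleContext(budget, systemPrompt, config, history, retrieved) {
  const pieces = [systemPrompt, config, ...history, ...retrieved];
  const context = [];
  let used = 0;
  for (const piece of pieces) {
    if (used + piece.tokens > budget) break; // later pieces are dropped
    context.push(piece.name);
    used += piece.tokens;
  }
  return { context, used };
}

const result = assembleContext(
  1000,
  { name: "system prompt", tokens: 300 },
  { name: "CLAUDE.md", tokens: 200 },
  [{ name: "history", tokens: 400 }],
  [{ name: "retrieved file", tokens: 250 }] // never makes it in
);
console.log(result);
// { context: [ 'system prompt', 'CLAUDE.md', 'history' ], used: 900 }
```

In this toy run, the retrieved file is dropped entirely because conversation history already consumed the budget, which is exactly why tight configs and aggressive context clearing matter.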
So let's talk about what to actually change.
This is the highest-leverage change you can make. Your CLAUDE.md or AGENTS.md file is the first thing the AI reads every session. It's loaded into the context window before anything else. Think of it as the "ground rules" written at the top of the whiteboard.
Most people either skip this file entirely or turn it into a 500-line manifesto. Both are wrong.
Research from HumanLayer and others suggests keeping your configuration file under 300 lines — ideally much less. The reason is practical: frontier LLMs can reliably follow roughly 150-200 instructions. The tool's own system prompt already consumes around 50 of those. That doesn't leave as much headroom as you'd think.
Every line in your configuration file should answer one question: Is this something the agent needs to know for virtually every task? If the answer is "only sometimes," it doesn't belong in the root config. It belongs somewhere the agent can find it when it needs it.
Focus on four categories:
Build and test commands. The agent shouldn't have to guess how to run your project.
```markdown
## Commands
- `npm run dev` — Start dev server (port 3000)
- `npm run test` — Run Jest tests
- `npm run lint` — ESLint check
```

High-level architecture. A map, not a novel. Enough for the agent to know where to look.
```markdown
## Architecture
- `/app` — Next.js App Router pages and layouts
- `/components/ui` — Reusable UI components
- `/lib` — Utilities and shared logic
- `/prisma` — Database schema and migrations
```

Non-obvious conventions. Things the agent couldn't infer from reading your code.
```markdown
## Conventions
- TypeScript strict mode, no `any` types
- Named exports only, never default exports
- Product images stored in Cloudinary, not locally
```

Landmines. Things that will break the project if the agent touches them.
```markdown
## Critical Warnings
- NEVER modify migration files directly
- The Stripe webhook handler MUST validate signatures
- Do not commit .env files under any circumstances
```

That's it. Everything else should live in separate documentation files that the agent can pull in when relevant.
This is the principle that makes context engineering work at scale. Instead of cramming every instruction into your root config, keep task-specific documentation in separate markdown files with descriptive names:
```
agent_docs/
├── building_the_project.md
├── running_tests.md
├── code_conventions.md
├── service_architecture.md
├── database_schema.md
└── authentication_flow.md
```

Then reference their existence in your root config:
```markdown
## Reference Documents
For detailed context on specific topics, read the relevant file
in the `agent_docs/` directory before beginning work.
```

The agent sees the root config on every task. It only pulls in `authentication_flow.md` when working on auth. This keeps the context window clean and the signal high — which is exactly what context engineering is about.
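A minimal sketch of how progressive disclosure plays out, assuming a hypothetical agent that maps task keywords to the doc files above. The keyword map and function are invented for illustration; real tools decide what to read through search and exploration.

```javascript
// Hypothetical sketch of progressive disclosure: a task-specific doc
// is loaded only when the request mentions its topic.
// The keyword map is illustrative, not a real tool's behavior.
const docIndex = {
  auth: "agent_docs/authentication_flow.md",
  test: "agent_docs/running_tests.md",
  database: "agent_docs/database_schema.md",
};

function docsForTask(request) {
  return Object.entries(docIndex)
    .filter(([keyword]) => request.toLowerCase().includes(keyword))
    .map(([, file]) => file);
}

console.log(docsForTask("Fix the login redirect in the auth middleware"));
// [ 'agent_docs/authentication_flow.md' ]
```

The root config stays small on every task; the auth doc costs tokens only when auth is actually in play.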
Here's something most developers don't think about: when an AI coding assistant searches your codebase, it's doing text-based navigation. It runs tools like `ls`, `cat`, `grep`, and semantic search to figure out which files are relevant to your request. It's at the mercy of your file names, folder structure, and code organization.
If your codebase is well-structured, the agent finds what it needs quickly and fills the context window with signal. If it's not, the agent wanders, fills the window with irrelevant files, and generates worse output.
You know what `utils.js` contains because you wrote it. The AI doesn't. It sees a file name that could contain anything, and it either skips it (missing important context) or reads the whole thing (wasting tokens).
Specific, descriptive file names cost you nothing and give the agent a much better chance of finding what it needs:
- `utils.js` → `string-formatters.js`, `date-helpers.js`, `api-retry-logic.js`
- `helpers/` → `auth-helpers/`, `payment-helpers/`, `email-helpers/`
- `types.ts` → `user-types.ts`, `order-types.ts`, `api-response-types.ts`

This same principle applies to function names, variable names, and folder structure. The more semantic information is encoded in names, the better the agent's text-based search works — and the less time it spends exploring irrelevant code.
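A toy version of that text-based search shows the difference descriptive names make. The file lists and matching function below are illustrative stand-ins for what a grep-style search actually does:

```javascript
// Toy keyword search over file names, standing in for the grep-style
// matching an agent performs. File names are illustrative.
const genericRepo = ["utils.js", "helpers.js", "misc.js"];
const descriptiveRepo = [
  "api-retry-logic.js",
  "date-helpers.js",
  "string-formatters.js",
];

function findCandidates(files, keyword) {
  return files.filter((file) => file.includes(keyword));
}

console.log(findCandidates(genericRepo, "retry"));     // []
console.log(findCandidates(descriptiveRepo, "retry")); // [ 'api-retry-logic.js' ]
```

With generic names, a search for "retry" finds nothing and the agent has to open files blindly. With descriptive names, the right file surfaces immediately.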
Large files are a context engineering problem. When an AI tool retrieves a 2,000-line file because one function in it is relevant, 95% of those tokens are noise competing for the model's attention.
There's no perfect line count, but a practical heuristic: if a file contains logic that serves multiple unrelated purposes, it should probably be split. Not for ideological reasons, but because it makes retrieval dramatically more efficient for your AI tools.
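As a sketch of what "focused" looks like in practice, here are the kinds of small, single-purpose functions that would live in files like `date-helpers.js` and `string-formatters.js` rather than one grab-bag `utils.js`. The functions themselves are invented examples:

```javascript
// date-helpers.js — everything in this file is about dates, so an
// agent retrieving it for a date task wastes almost no tokens.
function formatOrderDate(date) {
  return date.toISOString().slice(0, 10); // "YYYY-MM-DD"
}

// string-formatters.js — likewise, all string concerns in one place.
function truncate(text, max) {
  return text.length > max ? text.slice(0, max - 1) + "…" : text;
}

console.log(formatOrderDate(new Date("2024-05-01T12:00:00Z"))); // "2024-05-01"
console.log(truncate("context engineering", 10)); // "context e…"
```

When the relevant function lives in a small file whose name says what it does, retrieval pulls in a few hundred tokens of signal instead of two thousand lines of noise.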
Most codebases assume that the person navigating them has institutional knowledge — they know which service talks to which, where the auth layer lives, how data flows from the API to the frontend. AI agents have none of that.
A lightweight ARCHITECTURE.md at the root of your project pays enormous dividends:
```markdown
# Architecture

## Overview
E-commerce platform with a Next.js frontend, Express API,
and PostgreSQL database. Auth handled by NextAuth with
Google and email providers.

## Data Flow
1. User actions hit Next.js API routes in `/app/api`
2. API routes call service layer in `/lib/services`
3. Service layer uses Prisma ORM to query PostgreSQL
4. Background jobs processed by Bull queue in `/workers`

## Key Boundaries
- Frontend never accesses the database directly
- All payment logic isolated in `/lib/services/payments`
- Email sending goes through `/lib/services/notifications`
```

This file acts as the agent's mental map. When it needs to make a change, it doesn't have to explore blindly — it knows the territory.
Here's an uncomfortable truth: most code documentation is written for humans who already have context. We leave things implied. We use vague references ("see the usual setup"). We write READMEs that explain what the project does but not how it works internally.
AI agents can't fill in the gaps. They reason from exactly what they read. Documentation that serves both humans and AI follows a few principles.
Humans infer that UserService probably calls UserRepository. AI doesn't infer — it retrieves. If your documentation explicitly states these relationships, the agent knows which files to pull in when working on user-related features:
```markdown
## User Service (`/lib/services/user-service.ts`)
- Depends on: UserRepository, EmailService, AuthProvider
- Called by: API routes in `/app/api/users/`
- Emits events: `user.created`, `user.updated`
```

Every codebase has decisions that look wrong to a newcomer but are correct for reasons lost to history. These are the exact things AI agents will "fix" unless you tell them not to:
```markdown
## Why We Use Custom Retry Logic (Not Axios Retry)
The standard retry library doesn't support our exponential
backoff with jitter requirement for the payment gateway.
See: `/lib/api-retry-logic.js`
Do not replace with a library without discussing with the team.
```

For code that's genuinely tricky — complex algorithms, workarounds for third-party bugs, performance-sensitive sections — inline comments matter more than you might think. They end up directly in the context window when the agent reads that file, and they can prevent the agent from "improving" code that's intentionally written a certain way:
```javascript
// IMPORTANT: This timeout is intentionally 5000ms, not 3000ms.
// The payment gateway occasionally takes 4+ seconds on
// first-of-day transactions. Do not reduce this value.
const PAYMENT_TIMEOUT = 5000;
```

Even with a perfectly structured repo, long sessions with AI coding agents create a different problem: context rot. As the conversation history grows, it crowds out the instructions and configuration that were loaded at the start. The model starts "forgetting" your conventions, reverting to bad patterns, or losing track of the task.
This isn't a bug in the model. It's the attention budget running low. Here's how to manage it.
Most AI coding tools have a way to reset the conversation. In Claude Code, it's `/clear`. In Cursor, you can start a new composer. Use it aggressively between unrelated tasks. A fresh context means the agent starts with full attention on your config files and the current task, not the residue of three previous ones.
If your tool supports sub-agents or task delegation, use them. The idea is simple: instead of one long conversation doing everything, you delegate specific subtasks to separate agent instances that each get their own clean context. The code review happens in one context. The implementation happens in another. Neither pollutes the other.
This is the prompt engineering part of context engineering. When you ask the agent to do something, be specific about what files are relevant and what the scope is. "Refactor the auth module" is vague. "Refactor the verifyToken function in /lib/auth/token-service.ts to use async/await instead of callbacks" is precise. The agent spends less time exploring and more time executing.
Here's a practical summary you can refer back to. You don't need to do all of this at once — start with whatever is most broken in your current setup and work outward.
Configuration files: Root config under 300 lines, high-signal only, with task-specific docs in a separate folder using progressive disclosure.
File structure: Descriptive names (no generic utils.js), focused files (not 2,000-line monoliths), clear folder hierarchy that encodes architectural boundaries.
Documentation: ARCHITECTURE.md at the root with an overview and data flow. Dependency relationships documented explicitly. Non-obvious decisions explained where the agent will find them.
Code quality for AI: Semantic naming throughout (files, functions, variables). Inline comments on critical/tricky code. Clear module boundaries that map to retrieval boundaries.
Session management: Clear context between tasks. Use sub-agents for complex multi-step work. Write focused, scoped prompts instead of vague requests.
Here's what changes when you treat your repo as context for AI, not just code for humans.
The agent stops guessing and starts navigating. It finds the right files faster because your names and structure give it clear signals. It follows your conventions because your config file spells them out in terms it can actually attend to. It doesn't break things because your documentation tells it where the landmines are. And it delivers better output because the context window is filled with signal instead of noise.
Context engineering isn't a one-time setup. It's an ongoing practice — just like code quality itself. Your repo changes, your team learns new patterns, your tools evolve. The best developers I've seen treat their agent configuration and documentation the same way they treat their test suite: as a living system that gets maintained alongside the code.
Start with your CLAUDE.md. Fix your file names. Write an ARCHITECTURE.md. Then watch how differently your AI coding assistant behaves — same model, radically better context, completely different results.
Got a context engineering trick that's working for you? A repo structure pattern that made your AI tools noticeably better? I'd love to hear about it.

Hi, I'm a Full Stack Developer. I'm passionate about JavaScript and find myself working on a lot of React-based projects.