One of my infrastructure engineers told me a change would take five days.
It was a new service deployment in a twelve-year-old Terraform codebase that had been through multiple major versions, three or four repos, two migrations, and roughly seventy to eighty production services. The kind of codebase where the bodies are buried in places only two or three senior engineers know about. He wasn't wrong about the five days -- without the right context, any change in that repo is complicated.
I asked him to spend one day building a context map instead. A document that would explain where things live, what the acronyms mean, how our system differs from how the rest of the internet does things, and what a known-good recent change looks like as an example to follow.
He thought I was wasting a day.
Then I took that context map -- and I have never written Terraform in my life -- and spent forty minutes with an AI to generate what turned out to be roughly eighty percent of the correct answer. My senior engineer reviewed it, said "this is maybe a B-minus, a couple of things to fix," and took it to production in another hour.
Total: two days instead of five. Including the day building the context map. And that context map? We checked it into the repo. Every engineer can use it now. Every task in that area now costs forty minutes instead of five days.
The secret wasn't the AI. The secret was the spec.
The Intern on Perpetual Day One
Here's the mental model I keep coming back to when thinking about working with AI: it's a really smart intern who is permanently on day one.
Not incompetent. Not a bad hire. Genuinely capable, fast, eager, and knows a lot of things. But it doesn't know your codebase. It doesn't know your team's definition of "good." It doesn't know that DG is a thing you invented seven years ago and means something specific to your infrastructure. It knows what the average of the internet says is good -- which is quite different from what is good in your system.
Who in the room knows how to onboard that intern effectively? The people who've been doing it for years. The people who can say, "Here's the problem, here's what good looks like, here's an example, here are the constraints, let me know before you do something I'll have to undo." That is spec-based management. It works for humans. It works for AI. The interface is slightly different -- you're typing into a chat window instead of talking over coffee -- but the skill is the same.
The difference between giving an intern vague direction and giving them a proper brief is the difference between getting slop back and getting something you can actually use. The only thing that changed is the speed of the feedback cycle. With a human, you might wait a week to find out your brief was too vague. With AI, you find out in thirty seconds.
What a Good Spec Actually Looks Like
The Terraform context map worked because it answered four questions that the AI couldn't answer on its own:
What is the system? Not the generic definition -- your specific system. What names do you use for things, how do your components relate, where do the non-obvious dependencies live. The AI has to understand what it's working in before it can make good decisions about it.
What does good look like? Show it a working example. Not a hypothetical, not a description of one -- the actual thing. "Here's a change we made three months ago that did the same kind of thing. Do it like that."
What are the constraints? If you don't tell it what not to do, it's going to YOLO something out there. It will install a library you don't want, restructure something you didn't ask it to touch, or make a naming decision that breaks your conventions. Explicit constraints aren't optional. "Don't change anything outside of this directory. Don't create new modules. If you're unsure about a naming convention, ask me before assuming."
What is the success condition? What will you check to know whether it worked? This is the test before the test. If you can write down how you'll verify the outcome, the AI can orient its output toward that target instead of toward what it guesses you want.
This sounds like writing a ticket. It is writing a ticket. A real one, not a two-sentence one. The investment in writing a solid spec before a coding session -- whether you're working with an AI or a new hire -- pays off immediately and compounds over time if that spec lives in the repo.
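Put concretely, a context map that answers those four questions can be a single short document at the root of the area it covers. Here's a hypothetical skeleton -- the file name, paths, and section contents are illustrative placeholders, not a prescription:

```
# CONTEXT.md -- map for the service-deployment area

## What is the system?
- Where things live: shared modules under ./modules, services under ./services/<name>
- Glossary: DG = <your internal meaning>, not what the internet thinks it means
- Non-obvious dependencies: <service A> must be deployed before <service B>

## What does good look like?
- Known-good example: the <previous-service> change from <date / PR>. Follow its shape.

## Constraints
- Don't change anything outside ./services/<name>
- Don't create new modules or add new providers
- Unsure about a naming convention? Ask before assuming.

## Success condition
- The plan shows only the expected resources, validation and tests pass,
  and a senior engineer signs off on the review.
```

The point isn't the exact headings -- it's that each section answers a question the AI (or a new hire) cannot answer by reading the code alone.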
Spec-Based Agentic Development
I started calling this "spec-based agentic development" with my teams, which is admittedly not the catchiest phrase I've ever coined. But the name is intentional -- specifying before implementing isn't a nice-to-have, it's the whole practice.
The workflow looks like this: write a spec, have the AI review and ask clarifying questions before it writes a single line of code, then implement against the spec with the AI tracking progress like a checklist. Just like you would track Jira work items, except the velocity is much faster and the reviewable unit is the spec, not the commit.
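The front half of that workflow collapses into a single prompt pattern. A sketch of how I phrase it -- the wording is mine, not a magic incantation, and the file name is an assumption:

```
Read CONTEXT.md and the spec below. Before writing any code:
1. Restate the task in your own words.
2. Ask clarifying questions about naming, placement, or constraints.
3. Propose a step-by-step plan as a checklist.
Wait for my approval. Then implement one checklist item at a time,
marking each item done only when its tests pass.
```

The restate-and-ask steps are where vague specs get caught -- in thirty seconds, before any code exists, instead of in review.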
One rule I'm firm about: a coding session is not done until the build compiles and all tests pass. Not "I'll clean it up later." Not "it's probably fine." The AI doesn't get to declare victory. The spec and the test results declare victory together.
This also applies in the other direction -- if the AI is about to make a significant decision, I ask it to explain what it's going to do before it does it. Just like I'd ask a junior engineer to walk me through their approach before they spend three days going down a rabbit hole. "Explain the change you're about to make. Then make it." The explanation costs thirty seconds. Undoing an hour of wrong work costs a lot more.
Context Is Reusable
Here's the part that convinced my skeptical team: the context map doesn't expire.
That day we spent building it didn't just save five days on one task. It's a one-time cost amortized across every task that follows. Newer engineers who had never touched that part of the codebase can now work in it in forty minutes. The context lives in the repo, gets updated when things change, and gets better over time.
The same principle applies anywhere you're working in a complex system with AI. The time you spend building documentation, writing specs, and codifying what "good" looks like in your specific system is not overhead. It's the highest-return engineering investment you can make right now, because it compounds faster than it ever did when the only people using it were humans.
If you're trying to figure out where to start with AI-assisted development and your codebase has no context documentation, start there. Pick the area that's most expensive to onboard people into. Write the map. Spend a day. The next time you need to touch that code, you'll spend forty minutes instead of five days. And the time after that. And the time after that.
This is post 2 of 7 in The Boring Parts Matter: Engineering Fundamentals for the Agentic Era.

