What makes good documentation (with and without AI Agents)
TL;DR
- Document context that isn't already in the source code
- Design documentation to be consumed by both humans and AI agents
- Living documentation binds code and documentation together through executable tests
- Three proven methods: contextual code comments, minimal steps to runnable code, and decision logs
Documentation is hard but necessary. I frequently find myself in discussions about what should be documented, what's too little, and what's too much. I often advocate for "less is more" and try to document as efficiently as possible. AI agents add new opportunities and challenges: since they may eventually handle parts of the implementation, documentation has to be something they can read and create effectively, too.
The foundation of the methods I'll present:
- Document what isn't already in the source code (give code context)
- Design your documentation to be consumed by both humans and AI agents
- Living documentation is executable and binds documentation and code together (automated tests); after a change, correctness can be verified by running the living docs/tests, which scales even with a high change velocity (e.g., many developers or AI agents). A small sketch follows this list.
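A minimal sketch of what such living documentation can look like. The test framework (vitest) and the shipping-cost rule are assumptions chosen purely for illustration; any test runner and domain work the same way:

```typescript
import { describe, it, expect } from "vitest"; // assumption: any test runner works the same way

// Hypothetical business rule: shipping is free from 50 EUR (threshold and fee are invented).
function shippingCost(orderTotal: number): number {
  return orderTotal >= 50 ? 0 : 4.95;
}

// The test doubles as documentation: its name states the rule in plain language,
// and every test run verifies that code and documentation still agree.
describe("shippingCost", () => {
  it("is free for orders of 50 EUR or more", () => {
    expect(shippingCost(50)).toBe(0);
    expect(shippingCost(49.99)).toBe(4.95);
  });
});
```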
Here are three methods I've applied over the years that have found broad acceptance:
Support complex or difficult code sections with comments
Describing complex code sections in plain language seems straightforward, but AI can now often do this better and more contextually than you can as the code author. You don't know beforehand what questions a potential reader might have, making many comments irrelevant.
AI helps by providing appropriate answers based on the reader's prompt and the code. However, what AI cannot do is provide context that isn't in the code itself (a sketch of such a comment follows this list):
- Why must it be implemented this way and not another way (business requirements)?
- Did a specific client request this requirement, making the code complex at this point? Reference the specific person or team
- Was this code introduced due to a specific bug or ticket? This can be noted in comments (it's also visible in the Git log, but that's an extra step; AI agents typically read code lines and comments rather than searching the history)
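To make this concrete, here's a minimal sketch of such a "why" comment; the domain, ticket ID, client, and team are invented for illustration:

```typescript
// Rounds per line item, not once on the invoice total.
// Why: client ACME's accounting requires per-item rounding for its tax reporting
// (hypothetical ticket BILL-1234, requested by the ACME finance team, 2024-03).
// Do not "simplify" this into a single rounding step at the end.
export function invoiceTotal(items: { net: number; taxRate: number }[]): number {
  return items
    .map((item) => Math.round(item.net * (1 + item.taxRate) * 100) / 100) // gross per item, rounded to cents
    .reduce((sum, gross) => sum + gross, 0);
}
```

The first line says what the code does; the lines after it carry the context an AI agent (or a future colleague) cannot reconstruct from the code alone.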
Decision Log
A logbook for major code adjustments, inspired by ADRs in a simplified form at the repository level. Each decision gets an entry with reference number, date, author, involved parties, and description.
This makes decisions transparent. Examples of questions that can be answered through the Decision Log (a sample entry follows the list):
- Why was a component encapsulated?
- Why was a database schema change performed?
- Why did the service architecture need rebuilding?
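As a sketch, a single entry answering the first question might look like this; the number, date, names, and component are invented:

```markdown
## DL-042: Encapsulate the payment provider behind a PaymentGateway interface
- Date: 2024-05-13
- Author: J. Doe
- Involved: Checkout team, Platform team
- Description: The provider SDK had leaked into many modules. We wrapped it behind
  a single interface so that the planned provider switch only touches one place.
```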
All these questions can be answered in the Decision Log. Each decision is recorded in version control and reflects the knowledge state of the current version. Entries can also support code reviews: what you'd typically write in a PR ("Why was this change made in this form?") can be captured more sustainably here, avoiding duplicate effort that otherwise disappears into the repository platform (e.g., GitHub).
Decisions can be better understood in hindsight—the question "Why did we decide this way?" is asked less frequently.
Important: AI agents can read the Decision Log and incorporate it into their decisions:
- A central Decision Log in the project makes it easy for AI agents to read, as it's not distributed across code
- However, this can be problematic when attention is needed at a particular code location (equally difficult for humans and AI); in that case, fall back on code comments
- It's actually clear: the closer the information is to the relevant location, the better
- Therefore, use the Decision Log for larger decisions that affect multiple locations in the project
Metric: Steps to runnable code/service
This metric indicates how quickly a new developer can get the code running locally after cloning the repository:
- Successfully starting the API and testing it with an API testing tool like Bruno
- Running the unit, integration, and other tests
External dependencies (e.g., databases, message brokers, third-party APIs) should be accounted for, since they may be needed to run the service. The README.md should ideally lead to successfully runnable code within minutes (not hours) and without errors.
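As a sketch, a quick-start section that keeps this metric low could look like the following; the commands and tools (Docker Compose, npm, the Bruno collection path) are placeholders for whatever your stack actually uses:

```markdown
## Quick start
1. `git clone <repo-url> && cd example-service`
2. `docker compose up -d`        # local dependencies (database, message broker)
3. `npm install && npm run dev`  # API now listens on http://localhost:3000
4. `npm test`                    # unit and integration tests should pass
5. Import `docs/bruno-collection/` into Bruno and call `GET /health`
```

Five steps is the metric's value here; every additional manual step (editing config files by hand, asking a colleague for credentials) raises it and slows down every newcomer, human or AI agent.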
This matters whenever you need to get into unfamiliar code quickly. For example, a bug is reported and the colleague who usually maintains the service is on vacation: how fast can I get into the code? Locally runnable and debuggable code is the fastest route. The same goes for a new colleague who has to implement their first feature in the service: the faster they get to executable code, the better.
In the age of AI agents, this becomes even more critical. After an AI agent has changed something, getting the code running again is paramount: you need to check for compile errors, run the tests, examine the logs, and potentially feed the results back to the AI for correction. Changing a system with AI agents without executing the code is like navigating a ship in dense fog with the instruments turned off: you're not just blind to where you're going, you're blind to whether you're even moving in the right direction or about to hit something catastrophic.