
Release Checklist: Enforcing Definition of Done with Flexibility

How the release checklist, which is nothing but the Definition of Done, is misunderstood and misimplemented by most corporations, with serious negative implications for productivity, quality, and every aspect of the SDLC.


When software development organizations experience declining productivity, quality, and engineering effectiveness, the diagnosis is predictable: developers aren't doing what's expected of them. This gap between expectation and execution has been codified into what agile practitioners call the "Definition of Done"—a master checklist of everything developers should complete before calling work finished.

The pattern plays out with remarkable consistency. Management convenes workshops to create comprehensive Definition of Done checklists. When developers inevitably fail to adhere to these lists, the solution seems obvious: post the checklist at every desk. The implicit assumptions are revealing—that developers are capable of completing the list, that systems support them in completing it, and that the only problem is that developers don't care enough or simply forget.

There's a grain of truth to the last assumption, but only a grain. This pattern of blaming human operators rather than examining the systems in which they operate is pervasive in software organizations. The Definition of Done cycle repeats predictably, often annually or with each reorganization, as we've witnessed firsthand across dozens of SDLC organizations.

At SystemsWay, we find these Definition of Done initiatives largely useless—not because the items on the list are wrong, but because achieving them is often impossible when the interdependent technological and social systems of software development are poorly designed. When systems are broken, even excellent developers struggle to follow the Definition of Done consistently, let alone those who are less committed.

The lists themselves grow relentlessly longer: code quality standards, code review completion, all acceptance criteria met, unit tests passed, test coverage thresholds achieved, CI pipeline success, end-to-end testing, no critical defects, security checks passed, static analysis clean, technical documentation updated. Each root cause analysis produces the same solution: add another item to the checklist. Developers grow weary of such proliferation.

We've encountered developers who champion these checklists themselves, convinced they represent high standards while their colleagues represent low standards requiring accountability. But in most cases, we find the system itself is broken, making checklist completion unreasonable. We're not suggesting systems shouldn't meet these standards—we're observing that the way systems are configured makes achieving them nearly impossible.

The Enforcement Problem

Most Definitions of Done are enforced by embedding checks in pull request pipelines or post-commit CI/CD processes, implemented as hard gates that break the pipeline or block releases when checks fail. These checks execute sequentially in CI/CD, so build and test times grow with every check added.

Both approaches are fundamentally flawed. They execute checks when developers have different objectives, turning quality gates into obstacles that delay or pause actual work. The consequences compound. Pipelines break frequently. Many checks produce false positives. Dependencies on third-party systems create availability issues. We're fortunate that many companies lack sufficiently sophisticated technological systems—otherwise they'd add performance tests and every conceivable scan to the pipeline.

This approach becomes catastrophic when every layer of management embraces "shift left"—the ideology of adding more checks earlier in the development lifecycle. The earlier we are in the process, the more we should focus on customer requirements, showing work to product owners, and determining whether solutions will work at all. Often developers simply need to build and deploy to a test server. But CI/CD gates become hardcoded obstacles in the name of shift-left ideology.

Getting Definition of Done Right

The counterintuitive solution is to shift many checks right, not left. We understand this contradicts prevailing wisdom, but great systems aren't built by counting believers—they're built through sound explanations.

Checks should be configured in IDEs or CI/CD systems so developers can request feedback when they want it, ignore it when they don't, and, most importantly, choose not to run checks at all when they're unnecessary. When I'm validating a UI flow, I don't care about linting results; running those checks wastes time. I should only compile and package the code. Developers should maintain full control. When people say "break the CI/CD pipeline" or "break the build" if some checks fail, that goes against the principle of the developer being in control. It puts some person somewhere who never writes code in control, and that broad-stroke control ignores the fact that a given check might be of no value in the particular context in which a specific developer is working.
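
As a sketch of what developer-controlled checks could look like, here is a minimal opt-in check runner in Python. Every check name, command, and helper here is a hypothetical illustration, not a real tool's interface.

```python
import subprocess

# Hypothetical check registry: names and commands are illustrative only.
# Each check is opt-in; none is a hard gate.
CHECKS = {
    "compile": ["make", "build"],
    "lint":    ["make", "lint"],
    "unit":    ["make", "test"],
    "sast":    ["make", "static-analysis"],
}

def run_checks(selected):
    """Run only the checks the developer asked for; report, never block."""
    results = {}
    for name in selected:
        cmd = CHECKS.get(name)
        if cmd is None:
            print(f"unknown check: {name}")
            continue
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True)
            results[name] = "passed" if proc.returncode == 0 else "failed"
        except FileNotFoundError:
            results[name] = "tool unavailable"
        # Feedback is informational: results would also be reported to the
        # central system so the release checklist can display them later.
        print(f"{name}: {results[name]}")
    return results

if __name__ == "__main__":
    # Validating a UI flow? Compile and package only; skip lint and scans.
    run_checks(["compile"])
```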

The obvious question: how do we ensure developers do the right thing if we don't punish them by breaking the build so that they can't make progress without passing the check? Consider the objective. If the objective is that the Definition of Done be complete before code is released, then the feedback should be given during release approval: "you are missing the following checks" or "the following checks are broken." The answer lies in data collection and visibility. All tools developers run, or can run, report data to central systems. During the final submission, when developers request their code changes be released, which should happen daily or at most weekly, the system displays everything that's been done with that code: lines modified, commits included, which branch was used, and any new violations added across static analysis, dynamic security scans, and other checks. Each item receives a color: red means you made the situation worse, orange means you didn't improve when you could have, green means you made it better.

When there's red and the developer either can't fix it or wants to move forward regardless, they must explain why and obtain approval from another developer—not a manager, not an authority figure, but a peer. This system consistently improves productivity, quality, and engineering effectiveness. The Release Checklist represents management with command but without control, with behavioral economics built into its design.
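
To make the peer-approval step concrete, here is a minimal sketch, assuming red items are recorded with an explanation and signed off by another developer. Every name and data shape is an assumption for illustration, not the actual system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class RedItemOverride:
    """A red check released anyway: explanation plus peer approval, fully logged."""
    check: str
    developer: str
    explanation: str
    approved_by: str            # must be a peer developer, not a manager
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def can_release(red_checks, overrides):
    """Non-blocking by design: the release proceeds once every red item
    carries an explanation approved by another developer."""
    covered = {o.check for o in overrides}
    missing = [c for c in red_checks if c not in covered]
    if missing:
        print("awaiting peer approval for:", ", ".join(missing))
        return False
    return True

overrides = [RedItemOverride(
    check="security-scan",
    developer="dev-a",
    explanation="scanner unavailable; incident fix cannot wait",
    approved_by="dev-b",
)]
print(can_release(["security-scan"], overrides))  # True
```

Because every override is logged with its explanation, the record doubles as governance data: management can review patterns later without ever having blocked the release.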

Why Release Checklist Works

The system is designed to support developers, not hold them accountable. It operates on the principle that people want to do the right thing when they know what the right thing is. But who knows which check to run when? Who knows the cost of running a check, particularly in time? Why hardcode checks where developers always pay the price?

Control removes power from developers, disempowering them while DevOps or release operators use their authority to implement controls. The Release Checklist doesn't give power to developers—it enables them to make informed choices. We should never "empower" people to design their own systems, which inevitably become messy because systems design is both art and science. We should enable developers, which is what the Release Checklist accomplishes.

Adoption takes time. Some may initially ignore the system. But during the release process, they see clearly how they've made systems better or worse. They see when they need peer approval because they can't do the right thing or don't have time to do it, such as running expensive scans during incident management that demands rapid forward fixes. But they must provide explanations.

These explanations enable better governance. Explanations might reveal that tools providing the checks are poor, full of false positives, or have availability issues. This creates information management can act on, making tool owners accountable. The worst thing any company can do is allow tool owners—typically technologists—to design techno-social systems. The Release Checklist allows systems designers to design, tool owners to provide tools, developers to be enabled to do the right thing before releasing with exceptions and explanations, and management to govern both by fixing bad tools and holding accountable the few genuinely poor performers.

Remember: AI Won't Solve Broken Systems

We repeatedly emphasize: AI only works in good systems. If running all checks takes 70 minutes or two hours due to runs and reruns, or if checks execute even when unnecessary, AI is still wasting time. In another company with proper systems, AI can push that same code to production in 20 minutes. AI isn't a solution for poor systems. AI-driven cars won't work in developing-world cities like Delhi; they work well in developed cities like San Francisco. The difference is systems. AI can multiply the productivity, quality, and engineering effectiveness of your SDLC many fold, but only if your systems are capable of supporting it. Just as giving a Ferrari to every driver does not increase systemic productivity in a poor transportation system, but only in one that can support that speed, AI can only increase productivity and quality, reduce cycle time, and raise throughput in SDLC systems that can support it.

Case Study 1: Static Analysis as a Definition of Done Item

One organization posted the Definition of Done across all developer desks with a lengthy list. One item was static analysis using SonarQube. At one company, everyone was required to pass SonarQube, so every developer configured it for their repository. With 2,000 repositories, the configuration cost must have reached millions, though such costs never appear in P&L statements.

At another company, forcing SonarQube on every build added 10 to 15 minutes. Program managers created Excel spreadsheets and PowerPoint presentations showing CTOs which teams had violations. Under pressure, developers fixed some violations, but they kept reappearing. Many developers didn't care about fixing violations; a few cared, but the 15-minute tax on every build was prohibitive. Everyone blamed developers—developers even blamed each other. "Developers are not meeting Definition of Done requirements."

But was this accurate? Were developers unwilling to meet the Definition of Done, trying but failing, or simply not caring much? In reality, the latter two were true. SonarQube showed thousands of violations, and developers responded, "Not mine. I didn't create these. I don't have time. I'm already working more than ten hours daily." Static analysis metrics weren't improving.

The Release Checklist solves this problem at minimal cost through the power of systems design and behavioral economics.

At SystemsWay, we believe in release checklists paired with management education—both must go hand in hand. A release checklist is a list of checks developers should complete before rolling code to production, but we don't impose hard gates. Why? Because there are multiple scenarios during incidents where no one will care about the checks.

In our release checklist, we require another developer to approve deployment. This not only achieves SOC compliance but creates psychological pressure on two people to be extra vigilant. The two can decide when certain checks are incomplete but still allow rollout to proceed. There should never be hard blocks in techno-social systems—only transparent, open decision-making.

We create graphs and charts showing which checklist items are completed, incomplete, or failing—especially security scans. After six months of data, you'll realize many issues lie with tools themselves. Terrible scans, especially from information security teams. If InfoSec truly cares about security, they must partner with developers and question their tools rather than questioning developers. If hundreds of developers hate using particular tools, management should question the tool owner, not blame developers. If developers behave well with many checks but not with this specific check, what's wrong with your tool or system that's causing people to give up?
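
As one way such charts could be fed, here is a minimal aggregation sketch over a hypothetical release-checklist event log: a tool that is constantly red and constantly overridden is a tool problem, not a developer problem. All field names and records are illustrative.

```python
from collections import Counter

# Hypothetical event log: one record per check per release.
events = [
    {"tool": "sonarqube",    "status": "green"},
    {"tool": "infosec-scan", "status": "red", "overridden": True,
     "explanation": "scanner timed out again; third time this week"},
    {"tool": "infosec-scan", "status": "red", "overridden": True,
     "explanation": "false positive on generated code"},
    {"tool": "unit-tests",   "status": "green"},
]

def tool_health(events):
    """Count failures and overrides per tool; chronic overrides point at the tool."""
    failures, overrides = Counter(), Counter()
    for e in events:
        if e["status"] == "red":
            failures[e["tool"]] += 1
        if e.get("overridden"):
            overrides[e["tool"]] += 1
    return failures, overrides

failures, overrides = tool_health(events)
for tool in failures:
    print(f"{tool}: {failures[tool]} failures, {overrides[tool]} overridden")
# After six months of this data, the explanations themselves show whether
# the problem is the developers or the tool.
```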

Release checklists expose both the few badly behaving developers and the abundant badly implemented tooling. Management can work on both to create a healthy culture. Management must understand that tool owners complaining about developers not caring doesn't make it true—generally, tool owners' systems are poor, not well-designed, and waste developer time.

All checks remain open, transparent, and overridable. Management can later question specific cases, and for chronic failures, tool owners can be held accountable.

When we were called to address a static analysis problem, we added a release checklist to the system. We removed static analysis from pull requests and CI/CD pipelines, saving developers substantial time. When developers are writing code, they should focus on functionality and non-functional requirements only they can address—not static analysis. The concept of running everything in CI/CD pipelines is fundamentally flawed.

For every pull request, a separate process ran, and we wrote a specialized plugin that told developers the delta of violations: "There are 15,303 violations in this repository. You added three more violations." The system used color coding: adding any violations turned the check RED (shown in both the pull request and release checklist); ORANGE if the violation count remained constant; and GREEN if at least one violation was removed.
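
A minimal sketch of that delta logic, assuming a stored per-repository baseline count; the function and message format are illustrative approximations of the plugin described above, not its actual code.

```python
def violation_delta(baseline: int, current: int) -> tuple[str, str]:
    """Color a change by its delta against the repository baseline.
    RED: added violations. ORANGE: unchanged. GREEN: removed at least one."""
    added = current - baseline
    if added > 0:
        return "RED", (f"There are {current} violations in this repository. "
                       f"You added {added} more violation(s).")
    if added == 0:
        return "ORANGE", "Violation count unchanged."
    return "GREEN", f"You removed {-added} violation(s). Thank you."

# Example in the spirit of the case study: baseline 15,300, developer adds three.
color, message = violation_delta(15300, 15303)
print(color, "-", message)
# RED - There are 15303 violations in this repository. You added 3 more violation(s).
```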

The release checklist data was available to everyone. Everyone worked to ensure brand-new code didn't include new violations. In the AI era this can be automated, but we're demonstrating the power of thoughtful design and behavioral economics. No one acts on feedback about 15,000 violations they didn't create. Anyone will act on three violations they just added when memory is fresh and they can remove one more to achieve green.

This is what the release checklist accomplishes: non-blocking, instant, incremental, personalized feedback during release checkout, which must be explained to another person—but no blocks, while management monitors behavior. Massive improvement.

One day, a manager approached us with 27,000 static analysis violations, universal apathy, and a desire to fix it. Everyone offered the same CI/CD solution: blocking developers, holding them accountable. Developers on the team estimated 20 days to remove all violations. The manager asked, "Should I give them 20 days to fix the violations?"

I said, "No."

He was surprised, given how much I care about code quality. My explanation: once they fix it and reach zero, what happens next? Violations start piling up again because they're fixing under force, not performing self-cleanup. There are reasons they don't self-cleanup, and there are ways to achieve it.

He asked how. I explained I would show him, but first we should enable this feature: adding new violations equals red, maintaining count equals orange, reducing by one equals green, without blocking builds.

Three months later, I met the manager. "How do the violation numbers look?" I asked.

He said, "It's 27."

I said, "Still 27,000? I don't believe it."

He said, "Just 27."

I said that was expected. The manager asked how I knew.

"I understand how developers and their psychology work," I explained. "By creating this non-blocking system, developers must have stopped adding new violations because they can't defend that behavior. They might not have addressed existing ones. They might have even fixed more than they added to achieve green—we love the color green. It's like the Boy Scout principle: leave the place cleaner than when you arrived."

But the manager said, "That doesn't explain the 27,000 reduction. How did you know it would fall so low?"

"Because of how engineering teams work. In a ten-person team, three people care about code quality; the others don't care as much. But when the other seven don't care and keep polluting, the other three give up—it's a never-ending game without success. Once we closed the door on new violation creation, it empowered the three developers who care. They would, in a day or two somewhere, complete all the cleanup because they know the cleanup will endure. That brought it to zero. The reason it's 27 is because they determined those violations were false positives not requiring fixes."

The entire approach must be recognized: not everyone cares about everything in a company, but a few do care. Systems must be designed so power tilts in favor of those who care, not those who don't—because the latter will ensure everyone becomes helpless against the system.

This same strategy applies to different security scans, but InfoSec generally wields a big stick uniformly across all developers and product organizations, instead of thinking: first, how to stop the leakage; second, how to design systems that tilt in favor of those who care. Holding accountable people who don't want to sabotage systems but don't care is misguided—especially when the transaction cost of caring is so high that even people who want to care won't.

Case Study 2: Forcing Developers to Fix Bugs

In two companies where our people worked, the number of bugs was growing, SLAs (Service Level Agreements) were being missed, and everyone was worried. A major executive meeting was called, and everyone agreed that the company culture prioritized releasing features over fixing bugs—though each person had their own explanation. Some brave individuals were even able to blame the executives themselves for pushing feature rollouts over bug fixes.

Eventually, the question became: what should we do? They considered several solutions—offering monetary incentives to fix bugs, reducing pressure for feature delivery, holding town halls, and providing education. But everyone knew none of these would make any real dent. Still, something had to be done.

One VP came up with a solution: What if we block production rollouts of any component that has an out-of-SLA bug assigned to it? For example, if a bug had 15 days to be fixed and remained open after that period, the component couldn't be deployed.

We advised against implementing this policy. Generally speaking, creating and enforcing policies is hard, but the mood was that something had to be done, and no one was open to hearing the problems with this approach. Their response was, "Do you have a better solution to get developers to prioritize bug fixes over feature releases?" We admitted we didn't, but just because we lacked a good solution didn't mean we should implement a bad one.

In one company, they couldn't implement the policy because they didn't have bugs assigned to components and lacked a structured release process. After four months, a reorganization happened and this failed implementation was forgotten. To us, the company inadvertently saved itself from a bad policy through sheer incompetence.

In the other company, the policy was successfully implemented. The first issue they encountered was that a code rollout fixing a bug was blocked because the component had an open bug—the very one the code was fixing. After resolving such issues and others like it, code rollouts continued to be blocked. Within three months, there was so much uproar from business and product teams to remove this check that the company started losing business to competitors.

The point is this: when bad behavior is widespread—like people not fixing bugs—holding people accountable through mechanisms like blocking code rollouts is a terrible idea. You need to fix the systems instead.

Did we fix the bug management systems in these companies? We did in one of them. Once we fixed the bug management system, the bug problem went away. But remember: we fixed the systems, not the people. As we say, if Manish drives amazingly in the USA but terribly in India, it's because of the systems. Trying to "fix Manish" in India is a bad idea for improving driving; the good idea is to fix the transportation system that causes people to drive badly.

Key Principles

  1. No Hard Gates: There should never be hard blocks in techno-social systems, only transparent decision-making
  2. Behavioral Economics: Design systems that understand human psychology and incentives
  3. Empower Those Who Care: Tilt the system in favor of developers who care about quality
  4. Incremental Feedback: Provide personalized, actionable feedback on changes developers actually made
  5. Stop the Leakage First: Prevent new problems before trying to fix all historical ones
  6. Question the Tools: When developers universally dislike a tool, question the tool, not the developers
  7. Transparency Over Enforcement: Make data visible; let social pressure and pride drive improvement

The release checklist is not just a list—it's a philosophical approach to SDLC that recognizes systems are techno-social, that people respond to incentives, and that sustainable improvement comes from empowering those who care rather than punishing everyone for system failures.

How to Create a Release Checklist System in Your SDLC

The release checklist is a powerful system that developers love. It makes tool owners responsible and accountable for tool quality, lowers the management overhead of meeting code quality goals, and reduces production incidents and bugs. It's good all over the place. But how can one get a release checklist? No good thing is that easy to get. Anyone who uses SDLC THAT WORKS, provided by us, gets a release checklist for free from day one. That release checklist, where we shift right instead of left, works only after we educate the operators, nudgers, and designers of the SDLC on how to operate it right. For an existing SDLC, we suggest that its designers start with an assessment of their SDLC and training in Systems Thinking for SDLC designers, because you can have a world-class SDLC only if you complement your analytical thinking with systems thinking. SDLC.works can help you start this transformation and also provides courses through our education partner, the SystemsWay school of management and leadership, a rare school that teaches Systems Thinking courses for technologists.

Reach out to us at [email protected]

AUTHOR
Manish Jain

Fallibilist | Refutationist | Systems Thinker | Techno-Social Problem Solver | Educator

Content that informs is useful. Content that transforms is invaluable. Read SDLC Articles.

SDLC.works courses are transformative, and it reflects in each and every testimonial we receive.
