After listing conventional failed approaches and the guiding principles in the previous parts, this part will bring this short series to a close with a suggested approach to start managing your non-feature work in a healthier manner. This is not the one approach to rule them all and should not be treated as such. It’s a hybrid of different methods I’ve seen personally or heard of, and it can be used as the initial step in your quest to create a world-class engineering team.
First, the most critical step is to accept that quality is the responsibility of the entire team and should be owned by the whole team—not just the engineers. Unless we’re talking about tiny and minor bugs that take a few minutes to fix (e.g., typos in an email), bugs should be handled like all other tasks are. That usually means that bugs are part of the Product backlog and are not up to the engineers to try and find time for them (or fight for time to fix them). That way, the impact of the team’s quality bar is crystal clear to everyone, and all user-impacting work, be it new features or bug fixes, goes through the same decision pipeline.
And so, quality tasks should be scheduled as part of the ongoing work (e.g., as part of your sprints or iterations), with no reserved time allocation, since a reserved allocation would mean Engineering has to manage that backlog on its own. Every iteration, the team considers the impact of the different items in the backlog and prioritizes them together.
Next, things get a bit trickier as we move on to technical tasks—tasks whose impact and value to the business are not immediately apparent. For handling these, we will have to name different types of technical tasks and discuss each separately.
Routine technical maintenance tasks are work items that show up regularly to keep the system you already created in running order. Typical examples are renewing a certificate, upgrading a package for security reasons, or making minor changes to support a dependency whose API has changed. These should be treated like bugs: scheduled as part of the regular work and kept on the regular backlog. However, as this work is purely technical, the onus shifts to the engineers to clearly communicate its significance and time constraints. As mentioned in the previous part, for this to succeed, everyone involved should treat the other disciplines as equal partners. That means that Engineering cannot strong-arm whatever they want into the sprint by saying, “it just has to happen.” Explanations of impact and real timelines are the way to go here (e.g., “no, we don’t have to do it tomorrow, but the certificate will expire in a month, and so this has to happen before then”).
Discipline is also required here to make sure that there’s no scope creep. I don’t know how many times I’ve seen a trivial and menial task slated for much longer than it should take. That’s due to the tendency to think “while we’re at it” and pile more work onto these service tasks. The rule should be that these tasks are constrained to the maintenance work required, and that’s all (we will shortly address the other tasks). If you adhere to this rule and communicate well, these tasks become a non-issue.
Tech Capital and Debt
To handle the rest of the tasks effectively, we now need to address another couple of principles: engineering autonomy and not being able to sprint indefinitely. Handling technical debt and investing in technical capital is a crucial part of creating a stellar engineering team. On the one hand, significant investments of time have to be made with business ROI and impact in mind, and so should involve Product. On the other hand, engineers require autonomy to tinker and play and to assess which of the more significant investments are even needed. There should also be sufficient autonomy to allow them to clean up things from time to time (the classical “we should add tests to that module”).
My suggested framework for handling these is to incorporate intermissions between sprints. Some companies call these sabbaticals. They can be three days after every two-week sprint, a week after a six-week sprint, or a week every several sprints. I don’t care which; companies should experiment and adjust the frequency and amount of time to what fits their teams best.
Successful and productive intermissions require a few ground rules:
- Intermissions must be regularly scheduled. The team has to know well in advance when these are coming, so as to create a habit of taking the time to invest in themselves and their environment (as opposed to the one-off “hackathon”). Further, a repeating block of time on the calendar has the magical capability of being adhered to. The intermissions should not be negotiated on a one-by-one basis but accepted as part of the work process.
- Intermissions are not “buffer time.” Tasks from the recently concluded sprint should not leak into this time.
- Intermissions don’t have to be synced across the entire Engineering team and can be incorporated in a “staggered rollout” across teams.
- Work scheduled during intermissions is up to the engineers.
- Intermissions are not “breaks” or “personal 20% time.” Work should be done, and engineers are expected to make good use of the time (e.g., teams should still hold a daily standup).
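The cadence options and the “staggered rollout” rule above can be sketched as a small calendar generator. This is only an illustration under stated assumptions: `Block` and `build_calendar` are hypothetical names, durations are counted in calendar days rather than working days, and the one-week offset between teams is an arbitrary choice.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class Block:
    kind: str    # "sprint" or "intermission"
    start: date
    end: date    # inclusive


def build_calendar(start: date, sprint_days: int, intermission_days: int,
                   sprints_per_intermission: int = 1,
                   cycles: int = 2) -> List[Block]:
    """Lay out sprints followed by a fixed, recurring intermission.

    Assumption: lengths are calendar days; adapt to working days as needed.
    """
    blocks: List[Block] = []
    cursor = start
    for _ in range(cycles):
        for _ in range(sprints_per_intermission):
            end = cursor + timedelta(days=sprint_days - 1)
            blocks.append(Block("sprint", cursor, end))
            cursor = end + timedelta(days=1)
        # The intermission is its own block, not buffer time for the sprint.
        end = cursor + timedelta(days=intermission_days - 1)
        blocks.append(Block("intermission", cursor, end))
        cursor = end + timedelta(days=1)
    return blocks


if __name__ == "__main__":
    # Staggered rollout: each (hypothetical) team starts one week later,
    # so intermissions do not land on the same dates for everyone.
    for team, offset_weeks in (("team-a", 0), ("team-b", 1)):
        first_day = date(2024, 1, 1) + timedelta(weeks=offset_weeks)
        for block in build_calendar(first_day, sprint_days=14,
                                    intermission_days=3):
            print(team, block.kind, block.start, block.end)
```

Generating the blocks up front, rather than deciding after each sprint, mirrors the first ground rule: the intermissions are on the calendar before the sprint starts.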
Once you have regularly occurring intermissions, they should be used to perform a lot of the remaining non-feature work: for example, adding tests, prettifying the logs, or considering a switch away from a library you depend on. It’s also an excellent opportunity for working on internal tooling, learning a new framework, and so forth. The interesting remaining bit is handling bigger projects, those that would not fit in a single week or so.
Let’s consider a couple of examples, which are an amalgamation of real-world cases I’ve seen recently:
iOS 14 Preparation
Companies that have mobile apps are accustomed to the regular cycle of seeing what Apple announces during WWDC, and then allocating some of their time during the summer to incorporate what matters most to them in anticipation of the public release. Intermissions are a great time to watch the talks, perform a few POCs, and learn the impact of the different new APIs and capabilities. Then, some of the non-feature work during the summer will be devoted to implementing the learnings (e.g., changes to the way you can test in-app purchases).
Moreover, the first intermissions might result in exciting feature work: new capabilities that can make the product better and that should be discussed with Product. The remaining case would be big architecture changes that require a big effort, like moving to SwiftUI (or, a few years ago, moving to Swift). Once the kickoff has been done in intermissions (initial research and planning), if the change should be performed at a faster pace than the intermission cadence would allow, it should be planned with Product. Define the benefits and risks to help prioritize this work, along with everything else that’s on the roadmap.
An unfortunately all-too-common occurrence is the team that realizes every 2.5 to 3 years that it should rewrite its frontend in a new framework and architecture. For this discussion, we’ll assume that this rewrite is really necessary (e.g., AngularJS 1 dying and no longer being supported). Again, the kickoff stage would typically happen during a few intermissions: researching the right new framework to choose and laying the groundwork to allow the development of a hybrid app using both old and new tech in parallel.
Then, planning the roadmap should be done in a manner that, in every sprint, a decision is made with Product about the best opportunities to continue the rewrite project. Some sprints, it might mean that a couple of components that are being changed anyway will be rewritten. Other sprints, it might mean that someone will only have a couple of days to handle an old part of the code.
The key aspect in both of these cases is that the team performs the bigger projects in a way that allows them to be put on hold at any time if needed (e.g., for an opportunity that requires complete focus). That’s the purpose of the research and groundwork during the first intermissions of such an effort. Further, making it so that Product and Engineering have to discuss the value and current tradeoff regularly means that a healthy tension can be kept.
For example, most rewrites stop hurting after the 80% mark, where the team essentially stops touching all of the leftover legacy code. At that point, continuing to invest 25% of your Engineering capacity in it out of habit might not make sense, and moving the remaining cleanup back to the regular maintenance done in intermissions might make more sense.
This trilogy is the result of an ongoing process, working with many different companies. You should tailor it to your specific situation and culture. My suggestion to you would be to think about how you can apply this described approach to your current organization. Wherever changes are needed, you should consult the principles in the previous part.
As I continue learning, I will do my best to update these articles, and any input, remarks, or questions are welcome. Thank you for following along!