Delivery and Release Checklist
This checklist covers topics related to the delivery of software into production.
Have we arrived at a shipping / release cycle that is suitable?
Different application types call for different release cycles. Live services can roll out bugfixes incrementally without much ceremony, and can therefore be much more iterative. Large software packages require far more release planning and market communication; however, a regular release cycle (calendar releases) allows the organisation to organise itself around predictable timeframes.
Prefer regular releases to full-blown continuous deployment/delivery into production (releasing every change into full production traffic). Releases are an opportunity to group major feature deliverables across multiple teams into release notes. Continuous deployment into a staging environment is fine, but continuous delivery into production is overly risky and probably unnecessary. Continuously deliver into a production-like environment, then release or otherwise phase in the new release once it is ready (performance/regression testing complete). Don't be too zealous about the continuous delivery mantra.
What criteria do we have to understand if the user experience is good? Are we continuously monitoring the criteria?
How do we know that users are getting a good experience? Are they able to execute their user stories with low-latency responses? Does a journey involve fewer than X interactions/requests? Take time to work out what success means and monitor it.
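As a minimal sketch of what continuously monitoring such criteria could look like, assuming we log per-journey latency samples and interaction counts; the p95 budget and interaction threshold are illustrative, not prescribed values:

```python
# Journey-success check: thresholds and data are hypothetical examples.
from statistics import quantiles

def journey_healthy(latencies_ms, interaction_counts,
                    p95_budget_ms=500, max_interactions=5):
    """Return True when a user journey meets its experience criteria."""
    # quantiles(n=20) yields 19 cut points; index 18 approximates the p95.
    p95 = quantiles(latencies_ms, n=20)[18]
    return p95 <= p95_budget_ms and max(interaction_counts) <= max_interactions

# Example: a checkout journey sampled over the last hour.
print(journey_healthy([120, 180, 240, 310, 95, 400], [3, 4, 2, 5, 3, 4]))
```

A check like this can run on a schedule against the metrics store and feed the alerting system when a journey drifts out of budget.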
Are we taking false positives seriously?
Teams will quickly become desensitised to an alerting system that over-alerts. Measure the number of false positives and report on them. Dedicate time to reducing them, since they have a large negative effect on the effectiveness of the alerting system.
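A simple way to measure this is to label each alert as actionable or not during triage, then report the false-positive rate periodically. The sketch below assumes that labelling exists; the field names are illustrative:

```python
# False-positive tracking for an alerting system (hypothetical schema).
def false_positive_rate(alerts):
    """alerts: list of dicts with an 'actionable' boolean set at triage."""
    if not alerts:
        return 0.0
    false_positives = sum(1 for a in alerts if not a["actionable"])
    return false_positives / len(alerts)

week = [{"actionable": True}, {"actionable": False},
        {"actionable": False}, {"actionable": False}]
print(f"{false_positive_rate(week):.0%} of alerts were noise")  # 75% of alerts were noise
```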
Is the system health highly visible at all times?
So-called information radiators are an excellent way to keep teams alert to system health. (TODO: Remote teams?)
Can we support version roll-back? Is this tested and proven?
Rolling back is as important as rolling forward. Do we know which revision was previously in production/released? Do we all understand how to quickly roll back a production service? Which dependencies also need to be rolled back, and in which order?
Do we support both forward and backward compatibility on every change?
Backwards compatibility is a design that is compatible with previous versions of itself. Does the new version break an older dependency, such as a client? Forwards compatibility is a design that’s compatible with future versions of itself. Does the release accept additional data/parameters without balking?
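A common way to get both properties in message handling is to tolerate the unknown and default the missing. The sketch below assumes JSON payloads; the field names are hypothetical:

```python
# Tolerant message handling (illustrative field names).
# Ignoring unknown fields gives forward compatibility; defaulting
# missing fields keeps backward compatibility with older senders.
import json

KNOWN_FIELDS = {"user_id", "action"}
DEFAULTS = {"action": "view"}

def parse_event(raw):
    data = json.loads(raw)
    # Forward compatible: silently drop fields this version doesn't know.
    event = {k: v for k, v in data.items() if k in KNOWN_FIELDS}
    # Backward compatible: fill fields an older client may not send.
    for field, default in DEFAULTS.items():
        event.setdefault(field, default)
    return event

# A newer client sends an extra field; it is accepted without balking.
print(parse_event('{"user_id": 1, "session": "abc"}'))
# → {'user_id': 1, 'action': 'view'}
```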
If the answer is no to any of these we must be sure to understand why.
Do we have a process in place to catch performance and capacity degradations in new releases?
How do we know that the new release doesn’t introduce performance problems across its various deployment scenarios? Do we have a non-functional testing phase that covers them?
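One form such a phase can take is a release gate that compares the candidate's latency against the current production baseline. A minimal sketch, assuming latency samples in milliseconds; the 10% tolerance is an assumption, not a recommendation:

```python
# Release gate comparing candidate latency to a production baseline.
from statistics import median

def regressed(baseline_ms, candidate_ms, tolerance=0.10):
    """Flag the candidate when its median latency exceeds the
    baseline's by more than the allowed tolerance."""
    return median(candidate_ms) > median(baseline_ms) * (1 + tolerance)

baseline = [100, 110, 105, 98, 102]
candidate = [130, 125, 128, 131, 127]
print(regressed(baseline, candidate))  # True: roughly 25% slower
```

In practice the same idea extends to throughput, memory, and capacity headroom, each with its own tolerance.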
Are we running tests using real data?
Synthetic data is notoriously poor at simulating production traffic.
Do we have (and run) system-level acceptance tests?
Do we have an environment that lets us test at scale, with the same data collection and mining techniques used in production?
Are we continuously delivering software using CI software (Jenkins/Travis)?
Are we monitoring pipeline statistics and reporting on failure frequency and broken build durations?
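A report on those two statistics can be derived from the build history. The sketch below assumes each build record carries a status and, for failures, epoch-second timestamps for when it broke and when the build went green again; the schema is hypothetical:

```python
# Pipeline reporting over build records (hypothetical schema).
def pipeline_report(builds):
    failures = [b for b in builds if b["status"] == "failed"]
    failure_rate = len(failures) / len(builds)
    # Broken-build duration: time from a failed build to the next success.
    broken_seconds = sum(b["fixed_at"] - b["failed_at"] for b in failures)
    return failure_rate, broken_seconds

builds = [
    {"status": "passed"},
    {"status": "failed", "failed_at": 1000, "fixed_at": 4600},
    {"status": "passed"},
    {"status": "passed"},
]
rate, downtime = pipeline_report(builds)
print(f"failure rate {rate:.0%}, broken for {downtime // 60} min")
```

Trending these numbers over time shows whether the pipeline is getting healthier or whether broken builds are being tolerated.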