In the DevOps culture, failure isn’t exactly encouraged. Organizations want to succeed. They want every build, project and long-term initiative to achieve its individual goal. Still, DevOps failures are embraced as reality. If you’re going to fail, the thinking goes, make sure you fail fast, learn from it and use the newly gained insights to improve the process and product.
With this philosophy as a backdrop, we looked at a few common DevOps failures cited by CloudBees personnel and tried to draw some lessons from each one.
DevOoops #1: Unforeseen lock-outs
Name: Carlos Sanchez
Title: Principal Software Engineer
When you’re automating deployments using configuration as code, make sure you have the right protections in place (i.e., validation of configuration). Otherwise, it’s easy to lock yourself out, forcing yourself to manually log in to each machine to fix it when a bad change is pushed and deployed to all the machines.
Takeaway: As you bake quality into the processes and shift automation left, you need to make sure you validate all changes early and often.
This situation highlights how organizations need to implement proper processes, not only for application code changes but also for the infrastructure-as-code changes that manage environments. As you automate in pursuit of DevOps, you also need a strategy for managing credentials, secrets and permissions across your DevOps and production systems.
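As a rough illustration of validating configuration before it reaches every machine, here is a minimal Python sketch. Everything in it is hypothetical: the SshConfig fields, the validate and rollout helpers, and the apply callback (assumed to push the change to one host and return True only if that host is still reachable afterward) are made up for this example and do not come from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SshConfig:
    """Hypothetical remote-access settings managed as code."""
    port: int
    allowed_users: tuple
    password_auth: bool
    key_auth: bool


def validate(cfg):
    """Return a list of problems that would lock everyone out."""
    errors = []
    if not (1 <= cfg.port <= 65535):
        errors.append("port out of range")
    if not cfg.allowed_users:
        errors.append("no users allowed to log in")
    if not (cfg.password_auth or cfg.key_auth):
        errors.append("every authentication method is disabled")
    return errors


def rollout(cfg, hosts, apply):
    """Validate first, then change one host at a time.

    `apply(host, cfg)` is an assumed callback that returns True when the
    host is still reachable after the change. Returns the hosts touched.
    """
    errors = validate(cfg)
    if errors:
        # Reject the change before any machine is modified.
        raise ValueError("; ".join(errors))
    changed = []
    for host in hosts:
        changed.append(host)
        if not apply(host, cfg):
            break  # stop at the first host that fails its check
    return changed
```

With this shape, a bad change is either rejected before deployment or limited to a single machine, rather than being pushed to the whole fleet at once.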
DevOoops #2: Deploying out-of-step
Name: Laura Frank Tacho
Title: Director of Engineering
Running in parallel during CI/CD can speed up your automated testing and shorten your feedback cycles. Just make sure deployment is not executing as a parallel step. That misconfiguration will cause your code to deploy at the same time the tests are running, regardless of their exit codes, pretty much defeating the purpose of automated testing before a deploy.
Takeaway: This story highlights how crucial it is to ensure that changes pass through certain gates and checks before being deployed out to the customer. You need to construct your CD pipelines so your parallel testing is managed as a gate to production, ensuring that only properly tested and validated changes are deployed.
Another thing: While DevOps maintains that you should be able to comfortably deploy at any time, deployment needs to be treated as a managed and deliberate event. Deployment happens at the end of all the processes within a pipeline, after all required tests, validations, checks and approvals have been completed.
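To make the gating idea concrete, here is a small Python sketch of a pipeline in which test suites run in parallel but deployment is a separate, later step that depends on their exit codes. It is an illustration under assumptions, not how any specific CI/CD product works: run_pipeline, the test_suites mapping (suite name to a callable returning an exit code) and the deploy callable are all hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(test_suites, deploy):
    """Run all test suites in parallel, then deploy only if all passed.

    `test_suites` maps a suite name to a callable returning an exit code
    (0 means success). `deploy` is a callable that performs the deploy.
    Both are stand-ins for real pipeline steps.
    """
    # Parallel stage: only the tests run concurrently.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(suite) for name, suite in test_suites.items()}
        # Waiting on every result is the gate: nothing below runs early.
        exit_codes = {name: future.result() for name, future in futures.items()}

    failed = sorted(name for name, code in exit_codes.items() if code != 0)
    if failed:
        # Any non-zero exit code blocks the deploy.
        return {"deployed": False, "failed": failed}

    # Sequential stage: reached only after every suite has passed.
    deploy()
    return {"deployed": True, "failed": []}
```

The key design choice is that deploy is never submitted to the same pool as the tests, so it cannot start while they are still running or ignore their results.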
DevOoops #3: SCM conflicts
Name: Will Refvem
Title: Solution Architect
One of the biggest challenges in DevOps is SCM. Early in my career, being given access to Git repositories was a bit like handing a belt-fed machine gun to a drunken toddler. My biggest DevOops moment came when I was trying to master the basics of Git in a project that used Git submodules (don’t -- just don’t) and had a series of bash scripts that implemented a primitive CI/CD workflow by running in the environments we were deploying to (seriously). I can’t tell you how many merge conflicts and detached HEADs I encountered, or how I got out of them. Fortunately, we were using the fork-PR model, so I did no real damage.
Takeaway: A proper branching and merging model is required to ensure that you are feeding the appropriate inputs into your DevOps pipeline to let you continuously integrate, test and ultimately deploy changes.
DevOoops #4: Poorly defined KPIs
Name: Juni Mukherjee
Title: Product Marketer
How we define KPIs can change the game. For example, one quality engineering team defined the number of tests executed per sprint as a success metric. Humans are driven by incentives and the natural tendency of the team was to add more and more tests, without even considering archival of outdated ones. Think of it this way: we could have a larger impact with fewer tests and, more importantly, fewer tests lower the test cycle time. So, instead of sheer quantity, we should focus on the coverage and effectiveness of tests.
Similarly, a release engineering team defined the number of releases per sprint as a success metric. The number of releases certainly reflects velocity. However, releases move bits from point A to point B without assessing the value added to the business. So, it is critical to tie a business KPI (like the number of new customers acquired, percentage of revenue increase, etc.) to velocity, such that we know we are investing in valuable speed and not suicidal speed.
Takeaway: This is a critical, and often underemphasized, lesson in DevOps. You’re not implementing DevOps just to say you’re doing it. You need to develop goals and objectives for your initiative and align to them. Once you define these goals, you need to align KPIs and success metrics to those objectives. Properly defining these success metrics and KPIs will dictate what gets prioritized when building pipelines.
DevOoops #5: Failure to address organizational blockers
Name: Viktor Farcic
A real obstacle is cultural and caused by having siloed departments. When we start looking at a system as a whole, and detecting areas for improvement, the answer tends to be “this is not our department, we don’t know them, they won’t speak with us”. Those silos are the main argument in favor of DevOps; it tries to remove them. But, in reality, most companies fake it by creating yet another silo that is, this time, called the DevOps department. DevOps in practice directly contradicts DevOps ideas.
Takeaways: Organize your teams around subjects like product and functionality and include all the stakeholders in the software development process in these product or feature teams. This better aligns stakeholders and prevents the “us vs. them” tribal mentality that often plagues traditional software development.
Culture in DevOps is just as important as process, practices, tools and technology, yet it’s one of the most difficult things to change.
Like any sophisticated process, DevOps is going to experience its share of failures. There will be missed hand-offs and miscommunications along the way. Organizations also will make mistakes in the way they implement their overall DevOps initiatives.
These “DevOoops” stories highlight how DevOps transformations are hard to pull off. There will be issues, but they’re solvable. You need to approach your transformation as an iterative process of learning and improvement. If you fail occasionally, keep going. You’ll succeed in the long run.