Extracting accurate and meaningful answers from big data is tough. It's often made even harder by the lack of coordination between big data software developers and IT operations in many enterprises. Even though an IT organization may practice sound DevOps strategies for other supported applications, big data projects often remain siloed for a variety of reasons.
Today we're going to look at what DevOps is and why many big data project teams choose not to use DevOps methodologies. We’ll then move on to the benefits that DevOps can provide, as well as the challenges you might face when moving big data to a DevOps process model.
First, let's take a step back to define what DevOps is and learn why it’s become so popular. The idea of DevOps is to tear down the silos between software developers and IT infrastructure administrators to make sure everyone is focused on a single goal. A bit of cross-training is required on both sides of the house, to the point where the processes and terminology used are understood by all. Once training is complete, clear lines of communication and direction can be established with the aim of continuous improvement. Both teams work in tandem to test environments, tune production infrastructure components to meet new software requirements and, ultimately, bring software fixes and features to end users more rapidly.
Why big data formed without DevOps
The complexity of big data, and specifically of its analytical sciences portion, led many IT leaders to abandon the DevOps processes and procedures they use with the other applications the department supports. For those performing data analytics in-house, data science is a new internal function that is foreign to many IT professionals. Therefore, analysts and big data developers formed their own group apart from the operations side of the house. This separation of functions is how many big data teams still operate to this day.
Why big data needs DevOps
Because of this separation, the same inefficiencies and bottlenecks that DevOps practices solved in other applications are showing up in big data projects. In fact, the issues are being compounded. Since some big data projects are more challenging than originally expected, IT leaders are under increased pressure to produce results. This forces analytics scientists to revamp their algorithms. These major changes in analytic models often call for drastically different infrastructure resources than were originally planned. Yet the operations team is kept out of the loop until the last minute. Then, when infrastructure change requests do finally trickle in from the developers, the lag in communication and resource allocation slows down progress. This slowdown can erode any competitive advantage that big data analytics might provide. This is precisely why a DevOps model is needed.
Challenges when integrating big data and DevOps
If you decide to move your big data projects to a DevOps model, be sure to understand some of the challenges you will face along the way. For one, the operations side of the house must get up to speed on big data platforms and how analytics models are implemented.
Additionally, keep in mind that your analytics professionals think of themselves more as social engineers than as data engineers, so they’ll have some learning of their own to do. Next, the compute and network resources involved can scale to a magnitude never before seen in another production application. Therefore, if speed is a critical part of your DevOps plan, resource coordination is going to be of the utmost importance. Finally, understand that additional human resources will be required to make a big data DevOps effort run as efficiently as possible. DevOps isn’t about employee reduction; it’s about getting more out of your apps.
The benefits of mating big data with DevOps far outweigh any integration challenges. The efficiencies and improved coordination help to streamline processes, which speeds up the ability to make analytical changes on the fly and get more out of the data being mined. It might be just the trick to finally turn your fledgling big data project around.