IT must set clear policies and understand the costs of big data initiatives before uncharted experiments lead to data center chaos.
As the buzz around big data gets louder, the business stakes are rising. Management looks to IT and the data center to use big data to improve efficiency, enable more effective planning, and provide better service. Expectations also include higher margins and new sources of revenue.
These are all good reasons to consider a comprehensive big data initiative. But as with any major data center initiative, success will depend on the work done prior to any actual deployment. Before operations and marketing teams are seduced by the big data possibilities, IT should insist on a process that will minimize the risks to existing data center operations and infrastructure.
Big data initiatives must start with and remain subordinate to business needs, so step one should be to survey the business. Identify the data assets and target the business processes or decisions that would benefit most from big data analytics.
Now IT can get started, right? Wrong. Step two: create a data policy. Once a business becomes data-driven, the value of the data skyrockets. Information must be secured and managed appropriately, and you have to consider the privacy issues. Define a big data policy that includes guidelines for where the data will be stored and which business roles will have access to which information. Spell out how data will be backed up and how long information will be retained.
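As a rough illustration, the policy elements listed above (storage location, role-based access, backup, and retention) can be written down in machine-readable form so they can be checked automatically. The sketch below is only one possible shape; the dataset names, role names, and values are hypothetical and would come from the business survey in step one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetPolicy:
    """One policy entry per data asset (all field names are illustrative)."""
    storage_location: str        # e.g. "on_premises" or "cloud"
    allowed_roles: frozenset     # business roles permitted to read the data
    backup_interval_hours: int   # how often the data is backed up
    retention_days: int          # how long the data is kept

    def can_access(self, role: str) -> bool:
        return role in self.allowed_roles


# Hypothetical policy table; real entries depend on the organization.
POLICIES = {
    "customer_transactions": DatasetPolicy(
        storage_location="on_premises",
        allowed_roles=frozenset({"finance_analyst", "compliance_officer"}),
        backup_interval_hours=24,
        retention_days=2555,
    ),
    "web_clickstream": DatasetPolicy(
        storage_location="cloud",
        allowed_roles=frozenset({"marketing_analyst"}),
        backup_interval_hours=168,
        retention_days=365,
    ),
}


def is_access_allowed(dataset: str, role: str) -> bool:
    """Deny by default: unknown datasets and unlisted roles get no access."""
    policy = POLICIES.get(dataset)
    return policy is not None and policy.can_access(role)
```

The deny-by-default check reflects the point that a data asset without a written policy should not be open to anyone.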
With a policy in place that introduces guidelines and rules, IT can start to evaluate the requirements for big data analysis, propose a system, and project the costs for that system.
That last step -- estimating the costs -- encompasses the usual procurement and deployment costs as well as an assessment of the total cost of ownership for the analytics applications and any new hardware required. And, of course, you have to estimate the costs of collecting and managing data.
The impact on the existing data center is hard to predict, but it must also be considered. Big data and analytics can quickly become a significant burden. Cloud storage and hosting services can help, but compliance requirements, privacy concerns, and other factors can require that some, if not all, data repositories remain in the data center. Either way, IT must calculate the costs of scaling compute, file-server resources, and storage capacity.
What about energy and cooling? The costs of powering servers and storage platforms and related cooling systems have become a major component of the modern data center budget. On top of rising energy costs, energy availability is also becoming an issue for the largest data centers. Utilities that manage aging power grids are limiting the amount of energy some data centers can purchase at any price.
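The cost components discussed above (procurement, deployment, ongoing operation, and energy and cooling) can be combined in a back-of-the-envelope estimate. The sketch below is a simplification under stated assumptions: the function names and all figures are hypothetical, and cooling is modeled as a simple multiplier on IT power, similar in spirit to a power usage effectiveness (PUE) ratio.

```python
HOURS_PER_YEAR = 24 * 365


def annual_energy_cost(avg_it_power_kw: float,
                       price_per_kwh: float,
                       cooling_overhead_factor: float = 1.0) -> float:
    """Yearly energy cost for a given average IT load.

    cooling_overhead_factor scales IT power up to total facility power
    (assumption: 1.0 means no cooling overhead at all).
    """
    total_kw = avg_it_power_kw * cooling_overhead_factor
    return total_kw * HOURS_PER_YEAR * price_per_kwh


def total_cost_of_ownership(procurement: float,
                            deployment: float,
                            annual_operating: float,
                            annual_energy: float,
                            years: int) -> float:
    """One-time costs plus recurring costs over the planning horizon."""
    return procurement + deployment + (annual_operating + annual_energy) * years


# Hypothetical inputs for a small analytics cluster.
energy = annual_energy_cost(avg_it_power_kw=10.0,
                            price_per_kwh=0.10,
                            cooling_overhead_factor=1.5)
tco = total_cost_of_ownership(procurement=200_000,
                              deployment=50_000,
                              annual_operating=30_000,
                              annual_energy=energy,
                              years=3)
print(f"Estimated annual energy cost: {energy:,.0f}")
print(f"Estimated 3-year TCO: {tco:,.0f}")
```

Even a crude model like this makes it easier to compare scenarios, such as keeping a repository on premises versus moving part of it to a hosting service.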
Fortunately, energy efficiency concerns are not being driven by big data alone. Today's servers and data center equipment offer fine-grained, real-time power consumption and operating temperature data. Data center infrastructure management middleware and management consoles automate the collection and consolidation of this data.
Big data and compute-intensive analytics applications warrant a closer look at energy-management platforms. Consider a system that can consolidate collected data into real-time energy and thermal maps of the data center. A single-pane-of-glass view of the data center can help spot idle servers, which can waste as much as 15% of available power.
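To make the idle-server idea concrete, here is a minimal sketch of how consolidated readings might be screened. It is not the interface of any particular energy-management product; the record fields, the 5% utilization threshold, and the sample readings are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServerReading:
    """A consolidated per-server sample (field names are hypothetical)."""
    name: str
    cpu_utilization_pct: float
    power_watts: float


def find_idle_servers(readings, cpu_threshold_pct=5.0):
    """Return servers whose CPU utilization is below the threshold."""
    return [r for r in readings if r.cpu_utilization_pct < cpu_threshold_pct]


def idle_power_share(readings, cpu_threshold_pct=5.0):
    """Fraction of total measured power drawn by servers flagged as idle."""
    total = sum(r.power_watts for r in readings)
    if total == 0:
        return 0.0
    idle = sum(r.power_watts
               for r in find_idle_servers(readings, cpu_threshold_pct))
    return idle / total


# Made-up sample data, not measurements from a real data center.
sample = [
    ServerReading("app-01", cpu_utilization_pct=62.0, power_watts=400.0),
    ServerReading("app-02", cpu_utilization_pct=1.5, power_watts=100.0),
    ServerReading("db-01", cpu_utilization_pct=48.0, power_watts=500.0),
]

for server in find_idle_servers(sample):
    print(f"{server.name} looks idle at {server.power_watts:.0f} W")
print(f"Idle share of power: {idle_power_share(sample):.0%}")
```

A single low reading is weak evidence, so in practice a check like this would be run over a longer window before a server is treated as a candidate for consolidation or shutdown.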
Besides improving energy efficiency and server use, an energy management platform can improve server reliability. By identifying hot spots early, these systems help avoid damaged equipment and the related outages that can disrupt business.
In summary, businesses should only unleash big data once they have a full understanding of how it will change the data center. This means identifying the requirements, goals, and data policies up front. And it also means understanding the effect on the use of data center resources.
Big data can definitely yield breakthrough insights. With the right underpinnings, it can do so without breaking the energy budget or putting the infrastructure at risk.
Jeff Klaus is the General Manager of Data Center Manager (DCM) Solutions at Intel Corporation. He can be reached at Jeffrey.S.Klaus@intel.com.