Big Data. Big Decisions
InformationWeek
Special Coverage Series


Google Vs. Zombies -- And Worse

Take an inside look at how Google prepares for catastrophes, from flesh-eating zombies to earthquakes. This disaster prep team keeps it real and keeps it interesting.

After the zombies took over Google's data center, the heroic action of a few selfless individuals saved the day. Never underestimate what a site reliability engineer can do with an axe.

"If you look at zombies in the data center, they're after the people," explained Kripa Krishnan, technical program manager at Google. "So it becomes less of a machine's problem and becomes more of a people problem..."

The zombie invasion occurred back in 2007. It was one of the first Disaster Recovery Testing (DiRT) events created to evaluate Google's operational resilience in a crisis. This was before the Centers for Disease Control and Prevention began warning about zombies because storms, pandemics and earthquakes don't get people's attention anymore.

Although heroism has played a central role in saving Google more than once -- another scenario involved an executive wielding the teleportation gun from Valve's Portal -- it can't be relied on when disaster strikes, any more than any IT system or business process can in a crisis. Google as a company promotes the perception that its employees are exceptionally talented. But when it comes to preparing for the worst, the company can't simply assume that exceptional skills will save the day.

"We find that people are people, and they burn out if they work insane hours and long shifts," said Krishnan. "Heroic tactics are not a sustainable model if you're in a disaster."

The DiRT program was created seven years ago, and Krishnan began managing DiRT events a year after that. She is genial and sharp, with a penchant for using the word "goodness" to emphasize a point. Her background recalls the famously overachieving Buckaroo Banzai, depicted in the 1984 movie that bears his name as a neurosurgeon, physicist, rock musician and test pilot.

Hyperbole perhaps, but it's a necessary element in a story about heroism. Krishnan was studying medicine over a decade ago when her interests took her to music and theater. Three years in, she decided to study performance arts, and eventually came to the U.S. to focus on theater. Then a professor convinced her to take a computer science course. Having left science for the arts, Krishnan finally emerged from graduate school with a degree in Management Information Systems. Thereafter, she became involved with telemedicine networking in Kosovo and later landed at Google.

Now her job is to break things, as Krishnan explained in an interview at Google's Mountain View, Calif., headquarters.

"Sometimes we will bring in someone to write something that will cause a failure in some underneath layer and it will manifest itself as cascading failures in some front-end facing product," Krishnan said. Other times, she said, her team might direct someone to introduce corrupt data into a system to see how long it takes to find the problem.

DiRT is an annual exercise. Although various Google product groups conduct their own internal stress tests, DiRT's scope is companywide. DiRT scenarios challenge both technical infrastructure and organizational dynamics. Initially, the tests were restricted to user-facing systems, but they have been expanded to cover the full range of Google operations. Beyond data centers, DiRT testing might include systems used by facilities, finance, human resources and security, among other business groups. More recently, as the company's enterprise business has become more successful, customer support systems were added to the tests.

DiRT exercises require the work of hundreds of engineering and operations employees for several days, which means they're not inexpensive to run. They can affect live systems and have even resulted in revenue loss. But the price is deemed to be worth it.

Sanjay Jain, associate industry professor in the department of decision sciences at George Washington University, said in an email that the apparent increase in manmade and natural disasters around the globe demands more active continuity planning.

"Recently, companies have had to face major issues due to disasters including the loss of operations in New York and New Jersey area following Hurricane Sandy a few months ago, and the major impact on supply chains following the tsunami in Japan in 2011," he said. "Companies need to be more thorough in planning for safety of their personnel and maintaining business continuity in face of such eventualities. Such efforts have to go beyond duplicating data servers (that is of course needed) to employing live and computer simulations of potential disaster scenarios and their impact on companies' personnel, operations, and assets, and testing of measures to eliminate or substantially reduce the negative impacts."

In case of emergency, Google has a war room. DiRT tests are run from a simulated war room, which can be one of the company's many conference rooms.
