5/14/2014
10:26 AM

Social Science Site Using Azure Loses Data

Dedoose, a data analytics system, suffered a failure on Azure that may mean three weeks of lost data for customers.


Dedoose, a social science data company, has lost some of its customers' data while operating on the Microsoft Azure cloud. "This is a horrible moment for our company. We have never lost data or had a breach," said Dedoose president Eli Lieber in an interview.

At best, it will be able to restore data stored through April 11, Lieber said, and perhaps only up to March 30. For an uncertain minority of customers, data added to their accounts after April 11 has been lost, said Lieber.

"It's impossible to say" how many customers have lost data because the firm doesn't monitor how customers are using their accounts, he continued. "Only users can assess losses in their projects, not Dedoose," he said, but the company is doing everything it can to help them recover data.

Dedoose.com, founded in 2006 in Manhattan Beach, Calif., provides social scientists with an online analytics application, EthnoNotes, used by both commercial and academic researchers. Lieber said the application is used by marketers, pharmaceutical company researchers, academic social scientists, and others.

[Want to learn more about a documented failure in Azure operations? See Microsoft Pins Azure Slowdown On Cloud Software Fault.]

Dedoose offers its EthnoNotes application entirely on the Microsoft Azure cloud, and offers Gladinet.com, a free cloud storage service, as backup storage. Many customers put data into Word documents and Excel spreadsheets. Through its consulting services, Dedoose will help customers retrieve that data and rebuild their databases when other sources aren't available.

The Los Angeles Times reported that Dedoose officials sent an email Monday saying that both its operational and storage systems had failed. "The timing of this event was such that our entire data storage container was corrupted -- including the master database and all local backups," the company wrote in the email.

"Within minutes of discovering the problem, we contacted Microsoft Azure support. Unfortunately, Microsoft was unable to recover these data... from its servers," the Dedoose email to customers said. Microsoft officials couldn't be reached for comment.

"Since our initial communication about this issue, we have also learned that the backup files stored to an independent location were also corrupted. We are working with Gladinet services to have the non-corrupted data transferred to alternative storage locations and restore the integrity of these files," the message continued, according to the L.A. Times.

Asked who was responsible for the data loss, Azure or Dedoose, Lieber said: "It's a complicated situation." Dedoose produced its EthnoNotes system and installed it on Azure, but didn't expect that its data could be lost when it failed. He declined to estimate the respective shares of responsibility. Dedoose is still investigating why the system failed. Lieber is one of the authors of the EthnoNotes system.

"Things happened on the platform that we were blind to. Similar events have happened (to other Azure customers) and the data was not recoverable," Lieber said, without naming any other Azure users who suffered a similar failure.

Dedoose is revising its all-cloud approach to include a redundant system separate from the operational one on Azure. "We are putting measures in place so that if the platform is destroyed again, we will live on," said Lieber. "Within two weeks, our systems will be tremendously more robust."

But he added, "Ultimately, we are the responsible party. As we learn more about what happened, we are taking steps to protect our users." Lieber declined to say how many customers were affected or how many customers in total use EthnoNotes. But he said the number affected, while a minority, is higher than the "dozens of customers" suggested by another description.


Charles Babcock is an editor-at-large for InformationWeek, having joined the publication in 2003. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse ...

Comments
Andre Leonard,
User Rank: Apprentice
5/14/2014 | 1:04:58 PM
Catastrophic Failure..
There are certain events in business which are catastrophic. You reach a point you can never recover from. This is one of them. Learn from your mistakes, rebrand yourself and move on.
Charlie Babcock,
User Rank: Author
5/14/2014 | 1:35:27 PM
Rumors of its death may be highly exaggerated
I don't know how recoverable the data is by users themselves. This is social science project data, not necessarily business transactions with customer accounts screwed up and revenue lost. I suspect some of it is recoverable or able to be reconstructed by other than the usual backup and recovery means. But I don't know for sure. The company may survive this because of the value of its analytics application, EthnoNotes. 
JoeEmison,
User Rank: Strategist
5/14/2014 | 3:56:54 PM
Seems a bit disingenuous to name Azure so prominently
I don't think this article would have been posted without Azure being in the picture, yet it seems fairly clear from the description that Dedoose was treating Azure like a regular VPS/dedicated hosting provider instead of actually architecting for the cloud and failure.  "Storage and system went down at the same time?" I don't think you're supposed to be shocked about that if you have a clue what you're doing--that's a bit like saying, "My laptop was stolen AND so was the hard drive inside it".  Likewise, there's zero excuse on a provider if you aren't testing your backups.  You should constantly be checking your backups.

Perhaps the only reason to feature this type of incident and highlight the "cloud" aspect of it would be to point out that companies are still completely clueless about how to architect their applications for the cloud--and, in particular, blame their cloud providers for issues that are 100% their own fault.  (And absent any additional information, that's the right way to apportion blame here: 0% Azure, 100% Dedoose).
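
The point about checking backups can be made concrete. What follows is a minimal sketch of a restore drill, not a description of what Dedoose runs: the article does not name its database, so SQLite from the Python standard library stands in for it, and the function names, file names, and table name are all hypothetical. The idea is that a backup only counts as good once it has been restored into a scratch database and queried.

```python
import sqlite3
import tempfile
from pathlib import Path


def make_backup(live_db: Path, backup_file: Path) -> None:
    """Write a consistent copy of the live database to backup_file."""
    src = sqlite3.connect(live_db)
    dst = sqlite3.connect(backup_file)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def restore_drill(backup_file: Path, scratch_db: Path,
                  table: str, expected_rows: int) -> bool:
    """Restore the backup into a scratch database and sanity-check it.

    Returns False if the backup cannot be read, fails SQLite's own
    integrity check, or does not hold the expected number of rows.
    The table name is assumed to come from trusted configuration.
    """
    src = sqlite3.connect(backup_file)
    dst = sqlite3.connect(scratch_db)
    try:
        src.backup(dst)  # the actual restore step
        status = dst.execute("PRAGMA integrity_check").fetchone()[0]
        count = dst.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    except sqlite3.DatabaseError:
        return False
    finally:
        dst.close()
        src.close()
    return status == "ok" and count == expected_rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        live = root / "live.db"  # hypothetical stand-in for production data
        conn = sqlite3.connect(live)
        conn.execute("CREATE TABLE projects (name TEXT)")
        conn.executemany("INSERT INTO projects VALUES (?)",
                         [("alpha",), ("beta",)])
        conn.commit()
        conn.close()

        backup = root / "nightly.bak"
        make_backup(live, backup)
        print("restore drill passed:",
              restore_drill(backup, root / "scratch.db", "projects", 2))
```

A drill like this would normally run on a schedule and raise an alert on any False result, so that a bad backup is discovered before it is needed.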
Charlie Babcock,
User Rank: Author
5/14/2014 | 4:23:53 PM
It was all Dedoose's fault? Hmmm. Maybe....
Strong statements here from Joe Emison (we expect no less). But I'm not adopting the 0-100% split between Azure and Dedoose until I know a little bit more. For example, was any monitoring available to indicate the system was about to fail? I've asked Microsoft to respond. Still waiting. 
HichemS973,
User Rank: Apprentice
5/14/2014 | 8:39:23 PM
Important: Not an Azure failure, according to Dedoose, but an application issue on their side
http://blog.dedoose.com/2014/05/dedooses-black-eye-crash-and-recovery-efforts/

This devastating system 'collision' of Tuesday night resulted from a series of events leading to the failure of Dedoose services running on the Microsoft Azure platform.  To be clear, Dedoose services failed, not Azure.  In short, work done on one aspect of Dedoose led to the failure of another, cascading to pull down all of Dedoose.  The timing was particularly bad because it occurred in the midst of a full database encryption and backup.  This backup process, in turn, corrupted our entire storage system. Our immediate work with Microsoft support did not result in any substantial recovery.  Here's where we are and where we're going:

 
moarsauce123,
User Rank: Ninja
5/15/2014 | 12:26:24 PM
Welcome to the cloud!
All eggs in one basket without a backup or any means to control the infrastructure... I blame Dedoose squarely for the data loss. Anyone who has any experience with networked systems and server hardware should know that failure is always an option and it is unfortunately fairly high up on the list. Anyone who deploys cloud services without having a reliable plan B is just asking for well-deserved trouble.
Charlie Babcock,
User Rank: Author
5/15/2014 | 2:47:30 PM
Dedoose implements better data protection on Azure -- and Amazon
In a May 9 blog post following up on the crash, CEO Eli Lieber said Dedoose was implementing the following for better data protection:
  1. Deploying a database mirror/slave in Azure
  2. Deploying a database mirror/slave into Amazon S3
  3. Keeping a mirror copy of the entire blob storage including all file data, backups, video data synchronized nightly to our private server in an encrypted volume
  4. Storing nightly database backups on the VHD, Azure Blob Storage, and Amazon S3 Storage
  5. Mirroring all Azure file data into an Amazon S3 bucket
  6. Carrying out a weekly restore exercise for the database backups to ensure integrity
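
Several of the measures above come down to keeping verified copies in more than one place. As a rough illustration only, the sketch below copies a backup file to several destinations and confirms each copy against the SHA-256 digest of the original. Local directories stand in for Azure Blob Storage, Amazon S3, and the private server; the function names and paths are hypothetical, and nothing here reflects Dedoose's actual tooling.

```python
from __future__ import annotations

import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mirror_backup(source: Path, destinations: list[Path]) -> dict[str, bool]:
    """Copy source into each destination directory and verify every copy.

    Returns a mapping from each copy's path to True when its digest
    matches the original and False when it does not.
    """
    expected = sha256_of(source)
    results: dict[str, bool] = {}
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        copy = directory / source.name
        shutil.copy2(source, copy)
        results[str(copy)] = sha256_of(copy) == expected
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        backup = root / "nightly.bak"  # hypothetical backup file
        backup.write_bytes(b"example backup payload")
        report = mirror_backup(backup, [root / "mirror_a", root / "mirror_b"])
        for location, ok in report.items():
            print(location, "verified" if ok else "CORRUPT")
```

A matching digest only shows that a copy is faithful to its source. It cannot show that the source was a usable backup in the first place, which is why the weekly restore exercise in item 6 is a separate step.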

 
Dedoose,
User Rank: Apprentice
5/15/2014 | 4:25:56 PM
To clear a few things up
We are terribly sorry about this event. Dedoose is built and designed by well-known academic researchers. We are not a giant corporation; we are a small team of researchers and technology visionaries that built a collaborative tool for our own research needs and are trying to share this tool with the world at large. This is a giant tragedy, and we accept responsibility for this event and are offering our staff of methodologists to assist in recreating lost data, coding, analysis, etc. If you are affected by this data loss, please call us or shoot an email to support@dedoose.com. We did have a multiple backup strategy in place using proprietary software. It has been very challenging finding software to handle this job at the scales of data we are working with. Unfortunately, the software we chose to use occasionally corrupted backups. In our testing we did not encounter an issue, but during the event the full backups needed to restore were unrecoverable. This situation is complicated by the many layers of encryption our systems use: the database encryption, the backup encryption, the transfer encryption, etc. We are now running mirrors upon mirrors, off-site mirrors on a separate cloud, off-site mirrors locally, and our team developed a tool that automatically copies the encrypted database backups to Amazon's S3 cloud and Glacier storage, as well as an on-site copy. We will be adding additional measures to ensure the app automatically downloads a local backup of the data needed to restore a project whenever a user is logged in. When we developed Dedoose, we developed it initially for our research team. The biggest factor affecting our teams was the inability to collaborate on projects, thus we designed an online tool to do so. We continue to work on a non-collaborative offline version, but that was never our main focus.
We recognize the seriousness of this event, but urge you to understand we have put the protections in place to ensure a data loss event is not possible in the future short of a global cataclysmic event. Ultimately it was our fault for not having backups of backups, backups of those backups, and more backups of those backups. We have resolved that issue and this will not happen again. We are deeply, deeply, sorry and more than willing to assist affected users with our team of Dedoosers.
Laurianne,
User Rank: Author
5/15/2014 | 5:31:19 PM
Thanks
Thanks to Dedoose for weighing in here with your side of the backup story.
Charlie Babcock,
User Rank: Author
5/15/2014 | 9:03:04 PM
Dedoose takes full responsibility
I think Dedoose is taking full responsibility here and appropriate corrective actions. The lost data is likely to damage the projects of some of its customers and that will be upsetting to them, as well it should be. I hope customers understand the company is going through a tough phase and responding appropriately to future threats to continuous operations.