Google wants to see storage technology evolve to meet the demands of cloud computing by adopting designs that are more affordable, more error-prone, and better suited to collective operation.
In a research paper published Tuesday, Feb. 23, for the 2016 USENIX Conference on File and Storage Technologies (FAST 2016), Eric Brewer, VP of infrastructure at Google, calls for industry and academia to work together to adapt hard disk drives to current data center needs.
The problem, Brewer explains in a blog post, is that current "nearline enterprise" disks -- spinning hard disk drives as opposed to more expensive solid state drives -- are designed for traditional servers rather than large scale data centers supporting cloud computing. Cloud data centers are the fastest growing market for these disks, says Brewer, and they are likely to represent the majority of the market in the near future.
YouTube demonstrates why Google sees the need for change. YouTube users, according to Brewer, upload more than 400 hours of video every minute. With video requiring about 1 gigabyte of storage space per hour, that translates to more than a petabyte (1 million gigabytes) of new storage every day. And Brewer expects the rate of video ingestion to grow 10x every five years.
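The arithmetic behind those figures can be checked with a short Python sketch. The upload rate, the 1 GB per hour estimate, and the 10x growth rate come from the figures above; the function names and the `copies` parameter are illustrative assumptions. A single copy of each upload works out to roughly 0.58 petabytes a day, so the "more than a petabyte" figure appears to count more than one stored copy. That reading is an assumption, not something Brewer's post spells out here.

```python
# Back-of-the-envelope check of the YouTube storage figures quoted above.
HOURS_UPLOADED_PER_MINUTE = 400   # from the article
GB_PER_VIDEO_HOUR = 1             # from the article
GB_PER_PB = 1_000_000             # the article's definition of a petabyte


def daily_storage_pb(hours_per_minute, gb_per_hour, copies=1):
    """Petabytes of new storage per day.

    `copies` is a hypothetical parameter for how many copies of each
    upload are kept; the article does not give this number.
    """
    hours_per_day = hours_per_minute * 60 * 24
    return hours_per_day * gb_per_hour * copies / GB_PER_PB


def annual_growth_factor(total_factor, years):
    """Constant yearly multiplier that compounds to `total_factor` over `years`."""
    return total_factor ** (1 / years)


single_copy_pb = daily_storage_pb(HOURS_UPLOADED_PER_MINUTE, GB_PER_VIDEO_HOUR)
two_copies_pb = daily_storage_pb(HOURS_UPLOADED_PER_MINUTE, GB_PER_VIDEO_HOUR, copies=2)
yearly_factor = annual_growth_factor(10, 5)

print(f"One copy:   {single_copy_pb:.3f} PB per day")
print(f"Two copies: {two_copies_pb:.3f} PB per day")
print(f"10x in five years is about {yearly_factor:.2f}x per year")
```

Under these assumptions, 10x growth over five years corresponds to ingestion rising by a little under 60% each year.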
How then to accommodate our insatiable hunger for cat videos? Brewer argues that hard disks should be optimized to function as collections of disks rather than discrete devices associated with a single server. "This shift has a range of interesting consequences, including the counter-intuitive goal of having disks that are actually a little more likely to lose data, as we already have to have that data somewhere else anyway," Brewer says.
Such hardware wouldn't actually lose data. It would be designed to facilitate the management of disk errors at a higher level, through error correction and replication across multiple, coordinated disks. The result would be more affordable data center storage hardware. Individually, the disks would be more prone to errors. But collectively, they'd keep data secure while allowing improvements in capacity and performance.
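A rough Python sketch shows why less reliable disks can still add up to a more reliable system. The failure probabilities below are made-up numbers chosen for illustration, not figures from the paper, and the model assumes disks fail independently, which real data centers cannot take for granted.

```python
def loss_probability(per_disk_loss, replicas):
    """Chance that every replica of a piece of data is lost.

    Simplifying assumption: each disk loses the data independently
    with probability `per_disk_loss`.
    """
    return per_disk_loss ** replicas


# Hypothetical numbers for illustration only.
reliable_single = loss_probability(0.01, replicas=1)  # one sturdier disk
cheap_triple = loss_probability(0.05, replicas=3)     # three flakier disks

print(f"One reliable disk:      {reliable_single:.6f}")
print(f"Three error-prone disks: {cheap_triple:.6f}")
```

In this toy model, three copies on disks that are five times more likely to fail are still far less likely to lose the data than a single copy on the sturdier disk.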
The research paper, Disks for Data Centers, co-authored by Brewer, Lawrence Ying, Lawrence Greenfield, Robert Cypher, and Theodore Ts'o, also asserts that security must be improved as new use cases for storage are considered. It argues that security must be hardened to prevent unauthorized firmware changes and that encryption -- currently the cause of a major legal battle between Apple and the US government -- must be adapted to collections of disks through support for multiple keys. This would make it easier to secure data from different customers in shared disk space.