The Fourth Paradigm: All About The Data
On Monday, the New York Times devoted a page of its Science section to Dr. Jim Gray, a software engineer and former Microsoft researcher. Two years ago, he vanished off the coast of California in his yacht and was presumed dead, but he left behind a major body of work in the field of mass information analysis. His view was that scientists in general, not just computer scientists, are best served by systems designed from the inside out to process massive amounts of data efficiently and to help people visualize it. We live in a world where terabytes, petabytes and now exabytes of data are routinely generated. Scientific research is not the only source of that data, but it is one of the best fields for applying these insights, and certainly one of the most fruitful.
Gray's work is about to get a major boost in the form of a collection of essays that expand greatly on his insights and premises. The book is named The Fourth Paradigm: Data-Intensive Scientific Discovery, and it's free: not just in the sense that you can download the full PDF-format text of the book at the link, but free in its licensing. It's been published under a Creative Commons license, which allows people to re-use it in any number of contexts as long as they provide proper citation for the original work. I can see this being used in any number of statistics or computer-science courses because of that. Who's going to say no to an insightful book about a timely subject that doesn't cost a dime to use, and is packed with things that would be more than worth paying for in a print edition?
The book's main focus is how we gather and deal with the volumes of data that form the axis upon which crucial research revolves.
Good scientific work requires that your results be reproducible. At least part of that is the sharing of data: if your number-crunching is suspect, other people can take your raw data and attempt to achieve the same results. Gray's work, and that of his colleagues, points toward a future where scientific work that involves such data sets is not only more automatic, but far more of an open process than it is now. This isn't a matter of convenience, either. Our future survival as a species may depend on it. I'd wager it already does. The more of this work that's done out in the open, and the more tools we have to enhance that process, the better.