Commentary
8/29/2008
09:01 AM
Neil Raden

MapReduce: And You Were There

There's been a lot of buzz lately about Google's MapReduce framework for speeding up the processing of large datasets. It makes you wonder, did Google just dream this up in the last couple of years while all of the database vendors were sleeping? Or, paraphrasing Isaac Newton, were they standing on the shoulders of giants?

The answer is, both.

MapReduce is a programming framework, not a language per se. It is built on an old (40+ years) programming paradigm called functional programming (just for the record, the other major paradigm is called imperative programming and includes common languages like C# and Java). Maybe I shouldn't have said old, because my first programming language was an early functional language, APL. I was a casualty actuary, and APL was perfect for the kinds of mathematical manipulations we needed to do, such as matrix inversion in one keystroke, recursion, and manipulating n-dimensional structures with composite functions. We used to drive IT nuts. Functional languages are built, obviously, around mathematical functions, and some well-known functional languages today include K, a successor to APL, and the statistical language R.

The separation of functional and imperative languages is pretty leaky these days as lots of functional programming ideas have seeped into other languages. In particular, the concepts of map and reduce are widely implemented. So why, then, does it matter what you use?
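To make the two concepts concrete, here is a minimal sketch in Python, an imperative language that borrowed both ideas. It is only an illustration; the sample data and the names claims, with_surcharge and total are made up for this example and do not come from Google's framework.

```python
from functools import reduce

# Hypothetical sample data: four claim amounts in dollars.
claims = [120.0, 75.5, 300.0, 42.25]

# map: apply one function to every element, producing a new sequence.
# Here each amount gets an assumed 10% surcharge.
with_surcharge = list(map(lambda amount: amount * 1.10, claims))

# reduce: fold the whole sequence into a single value by repeatedly
# applying a two-argument function, starting from 0.0.
total = reduce(lambda running, amount: running + amount, with_surcharge, 0.0)

print(with_surcharge)
print(total)
```

Neither call says anything about loops or the order in which elements are visited, which is part of what makes this style attractive for spreading work across many machines.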

The symbolic language and its syntax, rules and scope have a lot to do with what programmers can achieve and how easily they can do it. But computers don't execute symbolic code; it has to be turned into instructions that a computer (or a whole bunch of computers) can understand. If every language just gets reduced to this level, you might wonder what the difference is. The real advantage is in the compiler. In a functional language, when two map functions are used in composition (putting functions together), the compiler can understand them together and eliminate the second, expensive pass over the data. The compiler designer, working from a purely functional position, can develop optimizations that really leverage the symbolic language.
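The idea of folding two maps into one can be sketched by hand in Python. Python does not perform this rewrite automatically, so the example below does manually what a compiler for a functional language could do on its own. The helper compose and the functions to_celsius and round_one are hypothetical names for illustration, and the rewrite is only safe because both functions are pure (no side effects).

```python
def compose(f, g):
    """Return a function that computes f(g(x))."""
    return lambda x: f(g(x))

def to_celsius(fahrenheit):
    return (fahrenheit - 32) * 5 / 9

def round_one(value):
    return round(value, 1)

# Hypothetical sample data: temperatures in Fahrenheit.
readings = [32, 212, 98.6]

# Two passes: the inner map builds a full intermediate list,
# and the outer map then walks that list again.
intermediate = list(map(to_celsius, readings))
two_pass = list(map(round_one, intermediate))

# One pass: the two functions are composed first, so the data is
# traversed once and no intermediate list is created.
one_pass = list(map(compose(round_one, to_celsius), readings))

# Both forms give the same result; only the work done differs.
assert two_pass == one_pass
```

On a small list the saving is negligible, but on datasets spread over many machines, skipping an intermediate result can mean skipping an entire round of storage and data movement.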

And this is where Google has had breakthroughs. They had to approach this problem as a fundamental aspect of doing business and developed some creative ways to really power through sets of data, but they didn't do it alone. Computer scientists have been advancing these ideas for decades.
