Over the 10 years since its creation, Apache Hadoop has moved from a fledgling technology championed by open source advocates to a platform that has become increasingly mainstream in enterprise IT shops.
In a keynote address on March 30 at the Strata + Hadoop conference in San Jose, Doug Cutting, Hadoop creator and chief architect at Hadoop distributor Cloudera, provided an informal State of Hadoop and Big Data address, looking back at the last 10 years and forward to what the future may hold for big data.
"It used to be different between open source and enterprise," Cutting told attendees during an address that led off the morning keynotes -- with open source "hippies" attending O'Reilly conferences while enterprise IT focused their attention and budgets elsewhere. "Now we've seen a merger of these communities -- enterprise and hacker."
Several factors combined to bring big data and Hadoop to this moment, he said, including the inexpensive hardware driven by the PC revolution and the open source community that created standards and turned these platforms into something that people could use at a very low cost.
"We had all the ingredients to really begin this change to ignite this revolution," he said. "Hadoop was the first to combine this into a single system."
The core elements of that system have remained essentially the same over the past 10 years -- the HDFS storage system, the MapReduce execution engine, and later the YARN scheduler. But over those 10 years more technologies have been introduced to improve Hadoop, including Apache Spark, which many organizations now use in place of MapReduce.
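The MapReduce model at the heart of Hadoop can be illustrated with the classic word-count example. The sketch below is plain Python rather than Hadoop's actual Java API: a map phase emits (key, value) pairs, and a reduce phase groups by key and aggregates. In a real cluster, both phases run in parallel across many machines, with a shuffle step routing each key to one reducer.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    # Shuffle + reduce: group pairs by key and sum the counts
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

docs = ["Hadoop scales out", "Spark replaces MapReduce", "Hadoop and Spark"]
result = reduce_phase(map_phase(docs))
```

Spark expresses the same computation as chained transformations on in-memory datasets, which is part of why it can outperform disk-based MapReduce for iterative workloads.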
"Technologies have developed around the [Hadoop] kernel," Cutting said. "And that's what will survive longer than the Hadoop project itself. A new family of technology has arrived, and a great example of that is Spark… Spark came out of the University of California, Berkeley. It didn't come out of a business. It came about because folks found it useful. We are seeing this again and again."
That pattern will repeat across the Hadoop ecosystem, Cutting said. There is competition for the best technologies at the storage level, at the query level, and in other areas, too, and as new technologies arrive, they will improve the whole.
"Hadoop's legacy is creating a new way of developing an ecosystem with collaboration," he said.
Today the hardware and software needed to run Hadoop is available at a much lower cost, and the system itself is much more scalable, Cutting said, with systems regularly scaling to tens of petabytes.
The technology is part of what is driving changes across all industries as they move to digital operations and customer service.
"Banks, insurance companies, manufacturers, retailers, and healthcare providers are adopting data technologies not at the periphery, but at the center of the business," Cutting said. "Data is becoming the fundamental driver of economic growth for the century."
Cutting provided a few predictions for Hadoop and big data in the next 10 years, too. Beyond the software stack, he said that he believes big data will get a boost from improvements to computer hardware. For instance, he said, Intel has created 3D XPoint, a new class of non-volatile memory that is substantially faster than flash storage.
"We've grown up with systems where the primary bottleneck was I/O," he said. "We are going to have the majority of data sets stored in memory, and that's going to change the applications that we can build."
Cutting also said that cloud computing has reached maturity, noting that Amazon Web Services (AWS) launched at around the same time as Hadoop and has gone through a similar adoption curve. More companies are now storing data in the cloud, he said.
"But the biggest change in the next 10 years is not going to be something I can predict, but will be things that you are involved in," he told attendees. "We now have a system that is in your hands. It is being created with your input. You can make a difference here. More so than ever before."