At Hadoop Summit this week, Hortonworks will point to increasing adoption and Hadoop's shift from trendy to mainstream technology.
Perhaps Steve Ballmer's available to whip the crowd at this week's Hadoop Summit into a frenzy with an encore of his timeless "developers, developers, developers" performance, only chanting the word "momentum" instead.
Expect momentum -- lots of it -- to be a pervasive idea running throughout the three-day event, which kicks off Tuesday in San Jose, Calif., and will be streamed through a free online broadcast. From jobseekers to vendors to corporate users just dipping their toes in the big-data water, Hadoop's got the forward momentum of an Olympic bobsled team.
"The pace of Hadoop usage is certainly accelerating," Hortonworks VP of marketing Dave McJannet said in an interview. "This emergence of a common data architecture with Hadoop as a core component has landed."
Hortonworks, which organizes the Hadoop Summit alongside Yahoo, expects more than 3,000 participants at this week's event, as well as more than 80 sponsor organizations, up from around 60 at the previous summit. McJannet points to the release of YARN last fall as one of the major drivers of increasing adoption and of Hadoop's shift from trendy to mainstream technology. YARN enables more dynamic resource management within clusters, extending them beyond batch-oriented MapReduce jobs so that organizations can run multiple applications in the same cluster. For example, McJannet says, in addition to a MapReduce job you might also want to run interactive queries and stream processing on the same data set.
"YARN, more than anything, [moved] these single-application clusters, which they may have been running in the [Hadoop 1.0] world, to these multi-application clusters where all of a sudden they'll have five, six, seven, eight" applications running on a single cluster, says McJannet. "That's leading many more end-users to start using it, and leading more large IT [vendors] to start engaging with it."
As a result, you can expect YARN to be mentioned quite a bit in tandem with that momentum. Hortonworks, for one, announced on Tuesday a "YARN Ready Program" as part of its partner certification program. The name's pretty self-explanatory: The program aims to help independent software vendors (ISVs) and other partners bring new users and their existing applications into Hadoop without tons of painful retrofitting. This coincides with the tech preview of Apache Slider, a technology for porting applications into YARN and Hadoop without having to rewrite them from scratch.
Back to that momentum: As Hadoop continues to evolve into a platform, not just for tech companies and big-data startups, but for organizations in almost any industry, the next waves of corporate adopters will pose three key questions before making the leap, according to McJannet. First: Why Hadoop? Second: Does this integrate with the technologies I already use? Third: How do I leverage my existing IT skills?
On the "Why Hadoop?" front, McJannet noted a common path for new adoption: building a single analytics application around a particular data set, such as clickstream, server log, or machine data. Then, the organization builds additional applications from there. YARN's ability to support multiple data processing engines in a single cluster is a boon for such piece-by-piece projects, which can be more efficient and palatable for some organizations -- and less ominous than some vague "Let's do big data, now" edict handed down from the C-suite.
Existing skills and legacy systems go hand-in-hand -- companies are less keen on taking the Hadoop plunge if their standing infrastructure, applications, and human resources go to waste. That's why you'll see more hardware pre-engineered to work with Hadoop, as well as partnerships between vendors like Hortonworks and IT stalwarts like Microsoft, Teradata, SAS, SAP, HP, Red Hat, and others.
"If you have a Teradata investment, if you have an Excel investment, if you have a SQL Server investment, you can plug in Hadoop as a component of that architecture so people can use their existing skills," McJannet told us. Again, YARN's a driving force: You no longer need developers versed in MapReduce to use Hadoop. "As mainstream adopters start [using] Hadoop en masse, the ability to leverage skills and integrate with existing technologies are probably number one and number two in their decision criteria."
OK, so you probably won't see Ballmer dance across the Hadoop Summit stage -- he's keeping busy in retirement -- but the mantra stands: Momentum, Momentum, Momentum.
Kevin Casey is a writer based in North Carolina who writes about technology for small and mid-size businesses.