AI, Public Data Sets, Real-Time: Strata + Hadoop Keynote Sampling

Strata + Hadoop keynotes included updates on the state of AI, new public data sets and programs from the US Department of Commerce, a closer look at what real-time data means for big data, and more. Here's a sampling of some of our favorite keynotes from this week's event.

Jessica Davis, Senior Editor

April 1, 2016

Comedian Paula Poundstone performed at Strata + Hadoop after the keynotes.

SAN JOSE -- Probably the longest presentation on the main stage at Strata + Hadoop World came from comedian Paula Poundstone, who thought Cloudera should print on its business cards that it's not an actual, literal cloud, because that's what the general public thinks it is.

Poundstone's performance was the longest because all the keynote addresses at this conference are short -- five to ten minutes. The keynotes provided the audience with a series of even-shorter-than-TED Talks that gave a sampling of insights from many different companies and presenters -- like a tasting menu for big data projects, products, and advances.

Here's a quick rundown of some of the talks and presentations from Strata + Hadoop this week.


Jack Norris, senior vice president of data and applications at Hadoop distribution company MapR, highlighted the trend on everyone's minds during this edition of Strata + Hadoop -- real-time data and streaming data.

"We are entering a new era," Norris said during his quick keynote address, and that new era is driven by real-time data. That can mean continuous data ingest rather than batch processing. It can mean super fast query response time.

"Real-time is required everywhere from the time the data is collected to when the business action happens," he said. Enterprises are embracing this now. For instance, real-time data is enabling Global Semiconductor to improve yield management optimization in its manufacturing process.

National Oilwell Varco (NOV), a $23 billion multinational company that provides equipment and components to the oil and gas industry, is leveraging real-time data. American Express is relying on real-time data to protect against fraud. Norris said that the winners in the data race won't necessarily be the companies with the most data, but will be the companies that can process this data in real-time.

Ian Andrews, vice president of products at Pivotal, used his micro keynote to ask this question: Have we reached the peak for BI? Is it still possible to gain more insights, or have we already extracted much of the value from business intelligence?

He said that while we may have reached the peak of BI, we are moving into an era where the insights we deliver aren't just in BI reports. These insights are also being built directly into software. For instance, there's a famous car service company that uses data within its app to tell you when the car you ordered will arrive at your front door. That's the kind of work Pivotal is doing, he said.

US Department of Commerce

Everyone relies on government data every day. That was the message of US Commerce Deputy Secretary Bruce Andrews, who highlighted some of the government data you are already using every day without realizing it.

For instance, the clock on your phone sets the time according to data provided by the National Institute of Standards and Technology. You get weather information from the National Weather Service. Your search to find the nearest Starbucks was likely enabled by data from the US Census Bureau.

Andrews spoke about how the US Department of Commerce, which includes about a dozen different agencies, is opening up new data sets to serve the nation's analysts, data scientists, and statisticians.

"We are America's data agency," Andrews said. "Our data on climate reaches from the depths of the ocean to the surface of the sun."

But it's about more than just making the data public. "We must make it as accessible as possible," Andrews said. "We want to unleash our data so you can use it."

Andrews noted that his department was the first in the government to hire a chief data scientist, whose role is to "supercharge our data projects to improve how people interact with our department."

As part of the data initiatives, Andrews has led the creation of a Commerce Data Advisory Council, which includes experts from the private sector to help the department in its data efforts and provide "a vision for the future in which the government keeps pace with the speed of business." Council members include people from Palantir and Amazon Web Services.

The Commerce Department has also set up something called the Commerce Data Service, a group that is building products to open more data to more people. A few brand-new initiatives out of this group include a new housing affordability calculator in the residential real estate app Zillow, an online mapping program called MapBox to help people use NOAA data, and the Earth Genome, which provides topology data for industrial real estate developers.

Much of this has been added to GitHub by Commerce.

"We are still in the early days of this project, and we want more people to share how they are using commerce data," Andrews said, and asked that organizations who want to contribute go to usability or fork them on GitHub.

Artificial Intelligence

Jana Eggers, CEO of Nara Logics, was among the speakers talking about AI.

Her company is a neuroscience-based artificial intelligence company focused on turning big data into smart actions. But she pointed out that AI is still in its childhood. For instance, she noted that Roomba, the really smart robot vacuum cleaner that can dock itself, tell you if it is having a problem, and plan the best way to clean the dust from your floor, still can't detect and identify the one substance that most people don't want Roomba to touch and then track around their rooms -- poop. It can be a messy problem.

Similarly, Microsoft's AI Tay recently experienced a big problem on Twitter, learning from Internet trolls and then sending out racist tweets that made big news. Eggers pointed out that you wouldn't teach your children about the world in this way.

"How do we not put Tay in a bar when she is 3 years old and have her pick up some of these bad habits?" Eggers asked.

Beyond the sensational headlines, AI development is actually at a bit of a trough right now, according to Eggers.

What we are doing now with AI is "incremental improvements, not radical improvements," she said. "To get radical improvements, we need new thinking. We need people who are trying to understand the product we are trying to produce." We need different minds working on these problems, like entrepreneurs, suits, grandmas, and kids, she said.

About the Author(s)

Jessica Davis

Senior Editor

Jessica Davis is a Senior Editor at InformationWeek. She covers enterprise IT leadership, careers, artificial intelligence, data and analytics, and enterprise software. She has spent a career covering the intersection of business and technology. Follow her on Twitter: @jessicadavis.
