Bad Winter Weather Meets Big Data Prediction - InformationWeek
11:10 AM
Doug Henschen

Bad Winter Weather Meets Big Data Prediction

The Weather Company is moving to a NoSQL-powered platform to gather some 20 terabytes of weather data per day. What's the biggest challenge?

When this winter's ice storms, arctic deep freezes, and nasty nor'easters hit, they provided a good time to hunker down and get some inside projects done. Weather Company CIO Bryson Koehler has been working on a big one: consolidating 13 datacenters down to four, relying extensively on public cloud providers, and moving to a NoSQL-powered big-data platform.

When we last spoke to Koehler his company was preparing to move its Weather Underground business onto the new big-data platform, which runs the Riak database on Amazon public cloud computing resources, with backup resources on the Google Compute Cloud. Next up, plans called for the flagship Weather Channel to move to that same platform within a matter of weeks. Koehler's team has learned some key lessons along the way, particularly about the challenge of predicting costs when using external cloud services.

Despite horrible winter weather that kept everybody at The Weather Company more than a little busy, its SUN (Storage Utility Network) project is on track, according to Koehler. SUN captures some 2.25 billion (with a "b") weather data points 15 times every hour, up from 2.2 million (with an "m") data points four times per hour on the company's legacy on-premises platform. All that new data -- some 20 terabytes per day -- supports more accurate weather prediction around the globe.
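To put those capture rates in perspective, here is a rough back-of-the-envelope calculation using only the figures quoted above. It assumes decimal terabytes and, as a simplification, that the 20 terabytes per day is made up entirely of those data points; the article does not break the volume down, so the bytes-per-point figure is only an implied average.

```python
# Figures quoted in the article.
NEW_POINTS_PER_CAPTURE = 2_250_000_000   # 2.25 billion data points
NEW_CAPTURES_PER_HOUR = 15
OLD_POINTS_PER_CAPTURE = 2_200_000       # 2.2 million data points
OLD_CAPTURES_PER_HOUR = 4

# Assumption: "20 terabytes" means decimal terabytes (10**12 bytes).
BYTES_PER_DAY = 20 * 10**12


def points_per_day(points_per_capture: int, captures_per_hour: int) -> int:
    """Data points gathered over a 24-hour day."""
    return points_per_capture * captures_per_hour * 24


new_daily = points_per_day(NEW_POINTS_PER_CAPTURE, NEW_CAPTURES_PER_HOUR)
old_daily = points_per_day(OLD_POINTS_PER_CAPTURE, OLD_CAPTURES_PER_HOUR)

# How many times more data points per day the new platform gathers.
growth = new_daily / old_daily

# Assumption: all 20 TB/day is attributable to these data points.
bytes_per_point = BYTES_PER_DAY / new_daily

print(f"SUN:    {new_daily:,} data points per day")
print(f"Legacy: {old_daily:,} data points per day")
print(f"Growth: about {growth:,.0f}x")
print(f"Implied average size: about {bytes_per_point:.1f} bytes per data point")
```

On these assumptions, SUN gathers 810 billion data points per day against roughly 211 million on the legacy platform, a jump of more than three orders of magnitude.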

[Want to learn more on Koehler's big-data plans? Hear him at the InformationWeek Conference, March 31-April 1.]

The Weather Company expects SUN to help it consolidate 13 datacenters down to four. The four remaining datacenters will be focused mainly on the company's broadcast infrastructure, which can't be moved to the cloud. The datacenter count is already down to eight, and it will go to seven by April. There have been challenges with the new platform, Koehler admits, with predicting costs at the top of the list.

"We have to make sure that we engineer [the system] so we understand the exact cost per transaction," Koehler explains. By year's end the company expects to handle more than 15 billion transactions per day on the platform, "so every 100th of a penny starts to add up." Those transactions are mostly web- and mobile-app service calls against the company's hundreds of APIs.
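Koehler's "100th of a penny" remark is easy to check with simple arithmetic. The sketch below uses the 15 billion transactions per day quoted above; the actual cost per transaction is not given in the article, so the only price used here is the hypothetical one-hundredth-of-a-cent change, and the yearly figure assumes that volume holds for 365 days.

```python
from decimal import Decimal

# Figure quoted in the article: expected volume by year's end.
TRANSACTIONS_PER_DAY = 15_000_000_000

# One 100th of a penny, expressed in dollars. This is the size of the
# hypothetical per-transaction change, not a real price from the article.
HUNDREDTH_OF_A_PENNY = Decimal("0.0001")


def daily_cost(cost_per_transaction: Decimal,
               transactions: int = TRANSACTIONS_PER_DAY) -> Decimal:
    """Total daily cost in dollars for a given per-transaction cost."""
    return cost_per_transaction * transactions


added_per_day = daily_cost(HUNDREDTH_OF_A_PENNY)
added_per_year = added_per_day * 365  # assumes constant daily volume

print(f"Each extra 100th of a penny costs ${added_per_day:,.2f} per day")
print(f"Over a year that is ${added_per_year:,.2f}")
```

At that volume, a shift of one 100th of a penny per transaction moves the bill by $1.5 million a day, which is why the exact cost per transaction matters so much.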

The cost levers include choices as to which type of public cloud service the company uses for a particular computing task, how much data it caches, and, in the vein of big-data analysis, how frequently it refreshes data and triggers new forecasts. SUN now offers a rich trove of data no matter where in the world forecasts are needed, but The Weather Company must decide how often to update the data and tap computing power to generate new forecasts. In a stable, highly predictable climate period in a city like Phoenix, for example, the demands are quite different than they are when, say, a cold snap is hitting the Midwest or a nor'easter is bearing down on tens of millions of people from Washington, D.C., to Boston.

The Weather Company's story is at the cross hairs of big data and cloud computing, but Koehler -- who admits he likes being edgy with technology and "turning it up to 11" -- says the challenges are often about the fundamentals of networking and ensuring that failovers occur seamlessly.

"We're still on the learning curve on how to best tune the system, how we monitor, and how we respond when things go wrong."

Engage with Oracle president Mark Hurd, NFL CIO Michelle McKenna-Doyle, General Motors CIO Randy Mott, Box founder Aaron Levie, UPMC CIO Dan Drawbaugh, GE Power CIO Jim Fowler, and other leaders of the Digital Business movement at the InformationWeek Conference and Elite 100 Awards Ceremony, to be held in conjunction with Interop in Las Vegas, March 31 to April 1, 2014. See the full agenda here.

Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...

D. Henschen
User Rank: Author
3/13/2014 | 11:25:12 AM
Re: Don't!
I'm not certain the Weather Company is throwing detail away, but it's not likely it's keeping everything given that the 20-terabyte-a-day capture is within an operational system running on a NoSQL database. This granular data feeds near-term forecasting. I suspect they require less detail for historical trend analysis.

This is another good question I'm going to ask Bryson Koehler during our Big Data panel at the March 31-April 1 InformationWeek Conference.
User Rank: Apprentice
3/11/2014 | 1:35:05 PM
Why throw away historical data if we want to predict future weather (which is what forecasting is all about)?
Charlie Babcock
User Rank: Author
3/10/2014 | 4:47:33 PM
A treasure trove of weather data
Weather Underground is collecting 20 TBs a day but doesn't need to save it all. At the same time, what a contribution to weather history and understanding worldwide patterns if it did. Granted, Weather Underground is not a non-profit, and doing so might convert it into one. But it's the first time in history we've had that much information in hand, so much we don't know what to do with it -- other than grab short term results and throw the rest away, so to speak.
D. Henschen
User Rank: Author
3/10/2014 | 2:13:52 PM
Re: Long-term outlook
A key metric for the new system is "time to live" (TTL -- as in the time the data has left to live -- not "live" as in, on air). As I understand it, they don't need all that detail forever. The fine-grained detail is for accurate forecasting NOW or today. Once the weather is history, far less data is needed to retain a historical record, so they can delete information they don't need. If you want to know more you can ask Bryson Koehler directly at the upcoming (March 31-April 1) InformationWeek Conference. He's one of two guests on our When Big Data Platforms Make Sense (And When They Don't) panel session.
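For readers unfamiliar with the idea, the following is a minimal sketch of how time-to-live expiry works in general. It is not Riak's API or The Weather Company's implementation; the `TTLStore` class, the key name, and the one-hour TTL are all hypothetical, and the clock is injectable only so the example runs deterministically.

```python
import time


class TTLStore:
    """Toy in-memory key-value store in which every entry has a time to live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # function returning the current time in seconds
        self._items = {}      # key -> (value, expires_at)

    def put(self, key, value, ttl_seconds):
        """Store a value that expires ttl_seconds from now."""
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        """Return the value, or None if it is missing or has expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # lazily drop the expired entry
            return None
        return value

    def purge_expired(self):
        """Delete all expired entries and return how many were removed."""
        now = self._clock()
        expired = [k for k, (_, exp) in self._items.items() if now >= exp]
        for k in expired:
            del self._items[k]
        return len(expired)


# Demo with a fake clock so no real waiting is needed.
now = [0.0]
store = TTLStore(clock=lambda: now[0])
store.put("station-1", {"temp_f": 71}, ttl_seconds=3600)  # hypothetical reading

fresh = store.get("station-1")   # read before the TTL has elapsed
now[0] = 3600.0                  # advance the fake clock by one hour
stale = store.get("station-1")   # read after the TTL has elapsed
```

Here the first read returns the reading and the second returns nothing, mirroring the point above: fine-grained detail is kept only as long as it is useful for current forecasting.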

Lorna Garey
User Rank: Author
3/10/2014 | 2:05:05 PM
Long-term outlook
I'm sure that closing half its data centers will make the cloud vs. in-house TCO balance sheet look pretty darn good for a few years. However, at that data volume, has Koehler done any projections out five or 10 years? 20 TB a day adds up, after all. And cloud providers thus far have not passed Moore's Law savings back to customers.
D. Henschen
User Rank: Author
3/10/2014 | 11:44:06 AM
More data brings more accurate predictions
We'll hear more from Koehler on the predictive part of this implementation at the March 31-April 1 InformationWeek Conference in Las Vegas. The SUN platform was an essential starting point because it provides so much more data than was previously available to drive accurate forecasts. More data brings greater accuracy, so stepping up from millions of data points four times per hour to billions of data points 15 times per hour is making a big difference, according to the company.