The Weather Company builds a new forecasting platform using Basho's Riak NoSQL database and Amazon Web Services.
"Weather is the original big data application," says Bryson Koehler, executive VP and CIO at the Weather Company. "When mainframes first came about, one of the first applications was a weather forecasting model."
Flash forward to today and the Weather Company ingests some 20 terabytes of data per day to spin out what Koehler bills as the world's most accurate forecasts. To stay ahead of its competition, the Weather Company is in the process of rolling out a new platform built on Basho's Riak NoSQL database and running globally in the Amazon Web Services (AWS) cloud. More than a year in the making, the new platform will bring the company's scale of analysis to a whole new level.
The Weather Company is the parent behind popular brands including the Weather Channel, WeatherFX, Weather Underground, and Intellicast. It serves hundreds of thousands of customers, including 30 airlines, emergency services, shippers, utilities, insurers, media giants, and the developers behind thousands of mobile weather apps. The demand adds up to billions of computer-based data requests per day, with latency expectations as low as 10 milliseconds.
The Weather Company's incumbent platform is more like a loose-knit collection of aging applications running across 13 data centers. Mainframes aren't on the list, but Koehler said the company uses a "one-of-everything" mix of databases, including MySQL, Microsoft SQL Server, Cassandra, MongoDB, and PostgreSQL. This generation of technology, which runs mostly on MySQL, captures 2.2 million current-weather-condition data points from around the globe four times per hour. The company's new consolidated platform, called SUN (Storage Utility Network), will capture 2.25 billion (with a "b") weather data points 15 times per hour.
"As with any large-scale, algorithmic-type modeling, the more data you have, the better the predictions will be," Koehler said, explaining the planned exponential increase in data capacity.
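To put the quoted figures side by side, here is a rough back-of-the-envelope sketch in Python. It assumes each quoted number is the count of data points gathered in a single capture, so hourly volume is points per capture multiplied by captures per hour. The variable and function names are illustrative only and do not come from the Weather Company.

```python
# Figures quoted in the article. Reading each one as "data points per
# capture" is an assumption made for this sketch.
OLD_POINTS_PER_CAPTURE = 2_200_000       # incumbent, mostly-MySQL platform
OLD_CAPTURES_PER_HOUR = 4
NEW_POINTS_PER_CAPTURE = 2_250_000_000   # planned SUN platform
NEW_CAPTURES_PER_HOUR = 15


def points_per_hour(points_per_capture: int, captures_per_hour: int) -> int:
    """Total data points collected in one hour."""
    return points_per_capture * captures_per_hour


old_hourly = points_per_hour(OLD_POINTS_PER_CAPTURE, OLD_CAPTURES_PER_HOUR)
new_hourly = points_per_hour(NEW_POINTS_PER_CAPTURE, NEW_CAPTURES_PER_HOUR)

# Scale up to a full day and compare the two platforms.
old_daily = old_hourly * 24
new_daily = new_hourly * 24
increase_factor = new_hourly / old_hourly

print(f"Old platform: {old_hourly:,} points/hour, {old_daily:,} points/day")
print(f"New platform: {new_hourly:,} points/hour, {new_daily:,} points/day")
print(f"Increase: roughly {increase_factor:,.0f}x")
```

Under that reading, the jump is from about 8.8 million to about 33.75 billion data points per hour, a factor of roughly 3,800.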
It's a big win for NoSQL technology and, in particular, Basho, which provides Riak Enterprise edition Multi-Datacenter Replication for ultra-high availability. NoSQL won out over relational databases primarily for its scalability, but Riak was chosen for its simplicity and ease of administration at high scale. It won out over Apache Cassandra, the runner-up choice, as well as MongoDB and Hadoop, which also got serious consideration, according to Koehler.
"When you're globally distributing massive amounts of data across Amazon nodes or Google Compute nodes, you want something that's simple to use and configure," said Koehler. "Cassandra, for example, is great at distributing data, but it's complicated and complex to run. Riak was built to handle massive data movement, replication, and data-synchronization on a cloud-based, globally distributed data platform."
The Weather Company's new platform is also a win for Amazon, which highlighted the Weather Channel story at its re:Invent event in Las Vegas earlier this month. The SUN system will be deployed across four AWS regions: US East, US West, Europe, and Asia.
"We wanted to get away from our 13 data centers and move everything to an infrastructure-as-a-service model," Koehler said.
Amazon is the Weather Company's primary cloud provider, but the firm is also planning to add cloud capacity from Google and other providers.
"Competition in the compute space is important, so we're ensuring that we abstract ourselves from being stuck on any one platform," Koehler said.
The SUN system made its debut in August, powering WeatherFX, the company's advertising targeting engine, which matches ads with weather-driven demand patterns and forecasts. It has also been rolled out in beta for the company's Forecast on-Demand platform. The Weather Underground site will move to SUN by Thanksgiving, according to Koehler, and the Weather Channel is expected to follow between now and the first quarter of next year.
With the new platform in place, Koehler said, the Weather Company has divided the globe -- land, sea, ice caps, and all -- into more than 30,000 four-square-mile squares. It can accurately report current conditions and offer predicted weather for each square hours, days, or even weeks in advance.