
GoGrid Emerges As Cloud's Big Data Specialist

GoGrid founder John Keagy says company's IaaS architecture, designed for big data, provides higher performance than other clouds.

GoGrid found itself in the Niche quadrant of this year's Gartner Magic Quadrant on the cloud, just like last year. And just like last year, that's exactly where it wants to be.

GoGrid started out in 2001 as a general-purpose infrastructure-as-a-service provider called ServerPath, which became GoGrid several years before Amazon thought up the possibility of Amazon Web Services (generally available in 2006). Through those early years, its grid of servers matched the basic infrastructure provisioning steps of AWS.

GoGrid founder and CEO John Keagy said this week that for the past two years GoGrid has been grooming its cloud service to become a niche player that caters to the needs of mobile applications, the Internet of Things, and other big data users. "GoGrid is staking its claim to being a big data niche player," says Keagy, rather than a general-purpose cloud service provider.

As a cloud pioneer, Keagy has seen IaaS move from an experimental service on the network to an industry powered by massive, 300,000-square-foot data centers pursuing huge economies of scale, combined with a steady fall in infrastructure service prices.


"We've seen enough of trying to be all things to all people. It's a lot more fun when you know who you are and who you're trying to serve," Keagy says in an interview. "If you want to run a high-performance, distributed database across multiple data centers on a 10G network, you can do that only on GoGrid and nowhere else," he says.

GoGrid operates three data centers: 25,000 square feet in San Francisco; approximately 10,000 square feet in Ashburn, Va.; and about 5,000 square feet in Amsterdam. Each data center has a 10 Gbit/s Ethernet network fabric, with 10 Gbit/s connections between data centers.

GoGrid will offer specialized virtual machines equipped with high-speed I/O and large amounts of RAM, solid state disks, and spinning disks. At GoGrid, the public cloud looks more like a customized private cloud. There'll be no multitenant servers. A big data customer will provision an Intel Ivy Bridge-based server, or group of servers, and be its only user.

GoGrid customers can choose a Xen virtual machine design that's a fit for their big-data problems, he explains. A high-input/output operations server makes use of solid state disks attached to the motherboard of the server and the 10 Gbit/s Ethernet for movement of data to disk. A RAM-intensive virtual machine with 256 GB is designed for in-memory databases and analytics applications.

An unusual option will be the raw disk server designed for the HBase NoSQL system. HBase is designed to handle large amounts of data through its ability to talk directly to the spindles of spinning disks. A raw disk server can combine CPU power with a concentration of up to 45 four-terabyte disks for a total of 180 TB.

"This is a completely different architecture" from the typical, accelerated I/Os between servers and block storage, like that provided by Amazon with Elastic Block Store, Keagy says. The 45 four-terabyte drives occupy a 4U space on the server rack close to the servers. They are directly attached to server motherboards via a serial-attached SCSI connection, also known as SAS, which lets data flow across parallel channels concurrently to multiple spinning disks. The 180-TB raw disk server includes up to 240 GB of RAM and up to 40 Ivy Bridge cores.

These designs are meant to capitalize on the data-handling characteristics of such systems as Cassandra, MongoDB, HBase, and Hadoop. Cassandra was "stress tested" on three GoGrid nodes in data-handling functions that required large and frequent writes of data. It sustained 700,000 transactions per second, says Heather McKelvey, CTO and senior VP of engineering at GoGrid, in an interview. DataStax, the company behind open source Cassandra, is partnering with GoGrid to make its service optimized for Cassandra. DataStax conducted the stress test.

Likewise, MongoDB can be optimized on a GoGrid virtual machine to do computation-heavy MapReduce jobs, McKelvey said. Part of the GoGrid approach is to get rid of network attached storage and SANs, and to let CPUs, RAM, solid state disks, and spinning disks work in tight proximity.

GoGrid is making it easy to get started in big data or build and test sample systems with a "one-button" deployment option for Cassandra, MongoDB, Hadoop, HBase, and Riak. For the longer term, Keagy thinks organizations needing to continuously collect and analyze big data will bring their custom production systems based on these NoSQL systems to the GoGrid cloud and route their data-collection functions there.

That would mean GoGrid has left plain vanilla infrastructure behind for a more specialized cloud service offering. "We love the gravity of big-data solutions," he notes. That means organizations that find they can effectively run their big data systems on GoGrid are also likely to collect and store their data there and perform their analytics there.

Keagy gave an Internet of Things-type example. If a healthcare provider has convinced thousands of customers to report thousands of metrics a day on how they're functioning in various activities, it will need monitoring and analytics systems to ensure that important data is detected and analyzed in a way that is meaningful to the patient. There's no point in detecting an impending heart attack if you have to wait for a report that comes at the end of the month, end of the week, or perhaps even the end of the day. To the patient, it's a real-time issue.
