6 Cloud, Big Data Startups To Watch

Check out what the Structure 2013 LaunchPad winners can do with cloud, big data problems.

Charles Babcock, Editor at Large, Cloud

June 21, 2013


SaltStack is a Salt Lake City startup that beat out five other startups to win the Structure 2013 LaunchPad competition. But two companies, both committed to DevOps types of systems, emerged as winners.

A three-judge panel of venture capitalists -- Luis Robles of Sequoia Capital, Ann Winblad of Hummer Winblad and Bipul Sinha of Lightspeed Venture Partners -- selected SaltStack for its comprehensive deployment management system and its early traction and rapid buildup of customers. The Salt open-source project on which it is based is only two years old.

CEO and co-founder Marc Chenn said developer interest in Salt, a Python-based system, has propelled it into the top 10 open-source projects in the world, ranking just behind the big cloud project, OpenStack, according to a GitHub ranking in December. SaltStack itself is "cloud agnostic" and can be used on top of multiple cloud systems.

Most enterprises still manage servers through manual processes carried out by systems administrators, who at most can handle 50 servers, are forced to work 60-hour weeks and provide "a breeding ground for Murphy's Law," said Chenn. Much of their work could be taken over by SaltStack.


"This tool has come out of nowhere," said Sinha after Chenn's short presentation at LaunchPad. It's in use at HP Cloud, LinkedIn, Hulu and Cars.com. It competes with older and highly regarded open-source configuration and deployment management software, such as Opscode's Chef and Puppet Labs' Puppet. SaltStack, however, covers more of the lifecycle of the deployed code, giving operations the means to provide feedback to developer teams on how to improve the software's performance.

Supported versions of SaltStack with enterprise licensing are now available from Chenn's firm.

"SaltStack is a total package -- cloud deployment, configuration management, remote execution and monitoring in a clean, well-designed package. It has enabled us to achieve things in record time that had stymied us prior to discovering it," said Patrick Crews, software/systems engineer for the HP Cloud, on the SaltStack website.

SaltStack was the judges' winner, but it wasn't the top pick of Structure attendees. About 300 in attendance voted on the candidates after their quick presentations at the conference center of the University of California at San Francisco's new Mission Bay campus.

Their pick was Factor.io, another DevOps-type system that brings automated workflow to application deployment. In a comment at the end of the competition, Winblad said Factor.io "was extremely interesting to me. I liked it the best." It won 28% of the audience vote.

The most accomplished Web companies don't manually build virtual machines, install applications and configure a software stack for operation in a cloud environment. "Amazon Web Services had adopted sophisticated deployment practices that let them deploy a code update every 11.6 seconds," said Maciej Skierkowski, co-founder and CEO of Factor.io. The company establishes a policy-driven workflow for deploying an application into a particular environment, including deploying to Amazon Web Services, Heroku or Microsoft Azure.

"We've automated the deployment process because deployment to the cloud sucks," said Skierkowski.

AppScale won neither the judges' nor the popular vote, but may have been dismissed too quickly, given its future potential.

AppScale co-founder and CEO Woody Rollins based his presentation on AppScale's initial use case, giving Google App Engine developers the means to fail over to a cloud setting outside App Engine. It's a capability they need about once a month, when App Engine operations experience an outage, he said in an interview after his presentation.

Rollins said Google had studied third-party developer practices for years before encapsulating what it perceived as best practices into its platform as a service. Developing Python or Java applications on App Engine is a quicker, surer process than in most development environments, he said. But once a developer establishes an account on App Engine, there are few places to go outside the Google infrastructure if he wants a backup and recovery location for his application.

In the long run, however, that's not the only thing AppScale will be able to do. A small team led by CTO Chandra Krintz, on a leave of absence from teaching computer science at the University of California at Santa Barbara, has developed an open-source Python environment that has duplicated the function of App Engine APIs.

"If an application runs on App Engine, it can run on AppScale," said Krintz in an interview after LaunchPad, and many developers have tested that hypothesis without being disappointed. That opens the door for startups basing their infrastructure on App Engine to build out a compatible infrastructure as a private cloud operation as they mature and bring more operations in-house. App Engine remains a relatively young setting for enterprise workloads. It didn't come out of beta until June 2012. But the future of an App Engine-style private cloud is not yet part of the AppScale product line and Rollins made no mention of such a possibility in his presentation. He stuck to the failover use case.

"Wouldn't it be worth $1,000 a year to make sure your application never fails?" he asked.

"AppScale has done a lot to replicate Google App Engine with a very small team," noted Robles, but the failover use case failed to win over the other two judges.

Three of the entrants were focused on data management and extracting more meaning out of the big data being collected today.

Mocking the SQL query language as 1978 technology, CTO Matthias Brantner of startup 28msec said it's time for a query language that is closer to real time, more flexible and able to read a broader range of data types, such as open-source JSONiq.

The name is taken from JSON or JavaScript Object Notation, frequently used by NoSQL systems to capture and exchange large amounts of both structured and unstructured data. The NoSQL systems MongoDB and Cassandra both support use of JSON, and JSONiq brings commands such as Project, Join, Group and Filter to data management.
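As a rough illustration of what those four operations mean when applied to JSON documents, here is a minimal Python sketch. It is not JSONiq syntax and says nothing about how 28msec implements its engine; the sample records and the helper names (`filter_docs`, `project`, `group_sum`, `join`) are hypothetical and invented for this example.

```python
import json
from collections import defaultdict

# Hypothetical sample documents, invented for illustration only.
ORDERS = json.loads("""
[
  {"id": 1, "customer": "c1", "total": 40, "status": "paid"},
  {"id": 2, "customer": "c2", "total": 15, "status": "open"},
  {"id": 3, "customer": "c1", "total": 25, "status": "paid"}
]
""")
CUSTOMERS = json.loads("""
[
  {"customer": "c1", "name": "Ada"},
  {"customer": "c2", "name": "Lin"}
]
""")


def filter_docs(docs, predicate):
    """Filter: keep only the documents for which predicate(doc) is true."""
    return [d for d in docs if predicate(d)]


def project(docs, fields):
    """Project: keep only the named fields of each document."""
    return [{f: d[f] for f in fields if f in d} for d in docs]


def group_sum(docs, key, value):
    """Group: sum the `value` field per distinct `key` value."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key]] += d[value]
    return dict(totals)


def join(left, right, key):
    """Join: merge each left document with the right documents sharing `key`."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


paid = filter_docs(ORDERS, lambda d: d["status"] == "paid")
paid_totals = group_sum(paid, "customer", "total")
named = project(join(paid, CUSTOMERS, "customer"), ["id", "name", "total"])
```

A declarative query language expresses the same pipeline in a single query and leaves the execution strategy to the engine, which is the contrast Brantner was drawing with hand-stitched code.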

"We've been mostly gluing and stitching data together using 1978 technology," said Brantner, referring to the year that IBM researcher Edgar Codd first published his paper on using mathematical set theory to build SQL. IBM didn't release it as a query language product until DB2 was ready eight years later. Brantner said it's time for a new generation of data management systems to take over from relational databases. His firm's 28.io data management system with JSONiq querying became available Thursday. It can retrieve data from "any system," the firm claims on its website, and Brantner said it works with both relational and NoSQL systems.

With a freely available extension, JSONiq can retrieve XML-based data. Brantner said the 28.io system captures XML, transforms it into JSON formatting and stores it on Amazon's S3 storage service. It is available either as a subscription service from the Palo Alto company or as software to be installed on-premises.
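The XML-to-JSON step can be pictured with a short Python sketch. This is only one naive mapping convention, assumed for illustration, and not a description of what 28.io actually does; the function names `element_to_dict` and `xml_to_json` are hypothetical, and the upload to S3 is left out entirely.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Map an XML element to a JSON-friendly value (assumed convention).

    Attributes become keys, repeated child tags become lists, and a
    leaf element with no attributes becomes its stripped text.
    """
    node = dict(elem.attrib)
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        if not node:
            return text
        if text:
            node["#text"] = text
        return node
    for child in children:
        value = element_to_dict(child)
        if child.tag in node:
            # Second and later occurrences of a tag are collected in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    return node


def xml_to_json(xml_text):
    """Parse an XML string and return it as a JSON string."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_dict(root)}, sort_keys=True)


sample = '<order id="7"><item>pen</item><item>ink</item><note>rush</note></order>'
converted = json.loads(xml_to_json(sample))
```

The sketch ignores namespaces, mixed content and name clashes between attributes and child tags, all of which a production converter would have to handle.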

Brantner spoke with conviction on the need to implement new data management platforms, such as 28.io, in a Web- and mobile application-based world. He and his team have been working on 28.io in stealth mode for seven years.

Brantner received a Ph.D. in information systems from the University of Mannheim in Germany. While there, he conducted research on "Rewriting Declarative Query Language" and previously worked on XQuery systems that extracted data from XML databases.

MetricaDB founder and CEO David Crawford said his firm is offering an online data-analysis service, able to draw data from SaaS applications, Google Analytics, Tumblr, Facebook, Twitter, Stripe and Zendesk, as well as NoSQL and relational database sources, and then analyze it for meaningful information.

Many businesses, having found they can collect big data from such sources, "are throwing people at the problem" of trying to find meaning in the data, Crawford said. MetricaDB supplies the interface to the data-collecting system, then lets the user analyze the data. "We are the analytics platform for the new data-based world," he said.

Another analytics approach was supplied by Synapsify. Stephen Candelmo, co-founder and CEO, said his Washington, D.C.-based firm used a small team of six to develop a patented method of text analytics. Its approach "moves beyond simple keyword extraction" to identify the credibility of a text and the seriousness or gravitas with which it is addressing a subject.

It can deal with short comments on the Web or book-length manuscripts. It curates a given text against competing sources. It has five pilot clients using its service on AWS.

Judge Sinha commented that Synapsify had impressive technology but needed to find a use case where its capabilities became a compelling business proposition.

In the judges' voting, Factor.io, MetricaDB and 28msec all tied for second place.

About the Author

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse University where he obtained a bachelor's degree in journalism. He joined the publication in 2003.
