Q&A With Gartner's Don Feinberg on Database as a Service and Cloud DBs
Microsoft, IBM, Oracle and Sun are now fueling the growing fire around the database-as-a-service and cloud database markets, but what's the difference between these offerings and what's the appeal? Database guru Don Feinberg defines terms and raises important questions about reliability and security.
What's your take on the emerging terms "Database as a Service" (DBaaS) and "Cloud Databases" (Cloud DB)?
Gartner is using "Database as a Service" [for the broad category] because we do not want to associate this only with "the cloud." To draw the distinction, companies like Kognitio and 1010data will sell you a database running on their systems at their sites. They host your database for you on their DBMS. You send them the data and they put it in a database and set everything up. You run your queries against their remote service. That is DBaaS, as opposed to managed services, because they're not having you pay for the hardware and then managing it for you. They charge you by the month for usage by the terabyte.
Now, why would I want to do that? There are several possible reasons. One is that my IT department can't do it for me, so I just work around them and go out and buy the service. A second reason might be that the IT department doesn't have the bandwidth and they encouraged me to take this route. Or maybe I have a short-term project that will last only a couple of months.
How does this compare with cloud databases?
I'll get to that in a moment, but first let's talk about DBaaS offerings that run in the cloud. The difference here is that instead of the vendor running the database service for me on their site, they are going to run it in the cloud. That's what Vertica, EnterpriseDB and Sun/MySQL are doing. You call them up and say, "I want an instance of your database in the cloud." They contract with Amazon EC2, set up the instance and give me a simple link. Now I have my database in the cloud. Oracle will also let you host your license on EC2. They're going to call it Oracle in a Cloud, but you'll have to put it on the EC2 virtual machine yourself. That's a little different from what Vertica and EnterpriseDB are doing because they will handle everything and you pay one vendor rather than dealing with Amazon on your own.
What's the difference between cloud compute power and what you get from a vendor with a data center in a particular location?
The key difference is that I don't know where it is. With the cloud, it could be in Bangalore, it could be in Russia or it could be in São Paulo, Brazil. Amazon won't tell you where their machines are for security reasons. You have no control over what machine your database is running on. You're buying a virtual machine (that's what the cloud is), and I don't know or care where it is.
This presents problems that I don't have if I'm using a DBaaS that's at a vendor's site. Number one, at a vendor site I can specify whether I'm using shared hardware and I have more control over who is using the same machines. Security-wise, if it's at a vendor site, I'm a little bit more comfortable; people use Salesforce.com in part because they are comfortable that the data is at their site. From a scalability standpoint, with DBaaS, I'm using the vendor's hardware and infrastructure, and they can tell me exactly what they do to ensure availability, redundancy and recovery. When you're in the cloud, you may not have those assurances. EC2 went down the other day and everybody went down with it.
Widely trusted companies like IBM and Microsoft are developing cloud capacity, but it sounds like you're inherently uncomfortable with that model.
Today, yes, I'm uncomfortable with the cloud. When Salesforce.com got started with software as a service (SaaS), very few companies used them. People didn't understand it. They knew that the customer data was going to be stored someplace else. They didn't know if it was secure. They didn't know if it would scale. And they also had no idea whether it would be reliable. Over time, Salesforce.com proved their offering to be secure, scalable and reliable, and today, anybody will put their information out there, but it took them seven or eight years to build up that level of confidence.