Data transfer, and the choice of where to locate important data stores, present a challenge in cloud computing for any number of reasons, the panelists said: slow and inefficient Internet links, security concerns, and architectural problems in terms of processing, to name a few.
TCP/IP, a core element of network traffic, has long been seen as inefficient, and has forced many companies to buy technology such as WAN optimization equipment to mitigate some of the problems associated with it.
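One well-known source of that inefficiency is that a single TCP connection can have at most one window of data in flight per round trip, so its throughput is bounded by window size divided by round-trip time, no matter how fast the link is. The sketch below works through that arithmetic. The 64 KiB window, 80 ms round-trip time, and 1 Gbps link are assumed values chosen for illustration, not figures from the panel.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP connection's throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds


# Assumed values, for illustration only.
window_bytes = 64 * 1024      # 64 KiB window (no window scaling)
rtt_seconds = 0.080           # 80 ms round-trip time
link_bps = 1_000_000_000      # 1 Gbps link

bound_bps = max_tcp_throughput_bps(window_bytes, rtt_seconds)
link_fraction = bound_bps / link_bps

print(f"Bound: {bound_bps / 1e6:.2f} Mbps ({link_fraction:.2%} of the link)")
```

Under these assumptions the bound is roughly 6.5 Mbps, well under 1 percent of the link. Larger windows or parallel connections raise the bound, which is part of what optimization products try to exploit.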
"It surprises people that when you have a 1-Gb connection on both ends, it doesn't exactly work well to transfer data over that link, and that's because of the problems and inefficiencies with things like TCP/IP," said Internet Research Group analyst Peter Christy. "It's a much greater problem than most people think until people run into it."
That's just the beginning, though. Just moving the data from on premises to the cloud can present a challenge in terms of time and money, especially if a company wants to use the cloud for extra capacity for a largely on-premises, data-backed application and needs to be able to "burst" quickly. Case in point: Amazon.com on Thursday launched a service for companies to ship their data overland rather than upload it.
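The trade-off between uploading and shipping comes down to simple arithmetic. The following is a rough sketch only: the 10 TB data set, the 100 Mbps sustained upload rate, and the three-day shipping turnaround are all hypothetical numbers, not figures from Amazon or the panelists.

```python
SECONDS_PER_DAY = 86_400


def upload_days(data_bytes: float, sustained_bps: float) -> float:
    """Days needed to upload data_bytes at a sustained rate in bits per second."""
    return data_bytes * 8 / sustained_bps / SECONDS_PER_DAY


# Hypothetical inputs, for illustration only.
data_bytes = 10 * 10**12       # 10 TB data set
sustained_bps = 100 * 10**6    # 100 Mbps actually achieved, not the nominal link speed
shipping_days = 3.0            # assumed door-to-door turnaround for shipped disks

days = upload_days(data_bytes, sustained_bps)
faster = "ship" if shipping_days < days else "upload"

print(f"Upload: {days:.1f} days, shipping: {shipping_days:.1f} days, faster: {faster}")
```

With these inputs the upload takes a little over nine days, so shipping wins. A faster sustained rate or a smaller data set flips the result.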
That said, sometimes the transfer time doesn't matter, because procuring additional servers would take longer than uploading or mailing the data to a company like Amazon. According to Omer Trajman, Vertica's director of field engineering, one of the company's financial services customers migrated its mortgage data to the cloud to begin new mortgage data modeling work because it couldn't wait three months for a batch of new servers to arrive at its data centers.
Even if companies solve the problem of latency and inefficient data links, other bottlenecks may be lurking where they don't expect them, such as at the cloud nodes themselves. Aspera estimates that Amazon EC2 nodes can sustain about 250 Mbps of throughput, meaning that companies with gigabit links will see performance fall short of what those links can carry when they attempt to transfer or process data via the cloud.
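The point is that an end-to-end transfer runs no faster than its slowest stage. A minimal sketch of that reasoning, using the 250 Mbps node estimate attributed to Aspera and a gigabit link, and ignoring protocol overhead (a simplifying assumption):

```python
def effective_bps(*stage_bps: float) -> float:
    """End-to-end rate is capped by the slowest stage on the path."""
    return min(stage_bps)


link_bps = 1_000_000_000   # gigabit link at the company's end
node_bps = 250_000_000     # Aspera's estimate for an EC2 node, per the article

rate_bps = effective_bps(link_bps, node_bps)
link_utilization = rate_bps / link_bps

print(f"Effective rate: {rate_bps / 1e6:.0f} Mbps, "
      f"{link_utilization:.0%} of the link")
```

On these numbers, three quarters of the gigabit link goes unused unless the transfer is spread across several nodes.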
Then there's the problem of what to do with large volumes of data once they're actually in the cloud. Amazon S3, for example, can't handle the largest of files, which requires companies to break up huge data stores into chunks. Aspera has had to help some of its customers do this with additional services, which could add significantly to the cost of moving to the cloud.
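Breaking a large file into pieces is conceptually simple, even if doing it at scale is not. The sketch below shows the basic idea with a generic generator. The `iter_chunks` helper and the tiny chunk size are hypothetical, chosen for illustration. This is not Aspera's method or an S3 API, and a real upload would set the chunk size to whatever the storage service allows.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive pieces of at most chunk_size bytes from a binary stream."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # empty read means end of stream
            break
        yield chunk


# Stand-in for a large file: 10 bytes split into pieces of at most 4 bytes.
source = io.BytesIO(b"0123456789")
parts = list(iter_chunks(source, 4))
sizes = [len(p) for p in parts]

print(sizes)
```

Each piece can then be uploaded as its own object and reassembled in order on the other side.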
With all these problems, some of the panelists saw the potential emergence of a new class of vendors working on what Aspera co-founder and CEO Michelle Munson calls "data movement," combining expertise in how to compress the data sent over the Internet link, how to store data as efficiently as possible, and how to eliminate redundant data through techniques such as data deduplication.
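To make two of those techniques concrete, here is a minimal sketch that combines chunk-level deduplication with compression, using only Python's standard library. It is a toy illustration of the general idea, not any vendor's implementation. The function names and the fixed-size sample chunks are assumptions.

```python
import hashlib
import zlib
from typing import Dict, Iterable, List, Tuple


def dedup_and_compress(chunks: Iterable[bytes]) -> Tuple[Dict[str, bytes], List[str]]:
    """Store each distinct chunk once, compressed, keyed by its SHA-256 digest."""
    store: Dict[str, bytes] = {}
    manifest: List[str] = []       # ordered digests needed to rebuild the data
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:    # only new content costs storage or bandwidth
            store[digest] = zlib.compress(chunk)
        manifest.append(digest)
    return store, manifest


def restore(store: Dict[str, bytes], manifest: List[str]) -> bytes:
    """Rebuild the original byte stream from the store and the manifest."""
    return b"".join(zlib.decompress(store[d]) for d in manifest)


# Sample data: the first and third chunks are identical.
chunks = [b"A" * 1024, b"B" * 1024, b"A" * 1024]
store, manifest = dedup_and_compress(chunks)
stored_bytes = sum(len(v) for v in store.values())

print(f"{len(manifest)} chunks, {len(store)} stored, {stored_bytes} bytes kept")
```

Only two of the three chunks are stored, and each is compressed, so far fewer bytes would need to cross the link than the 3,072 in the original data.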
InformationWeek has published an in-depth report on cloud storage.