Software vendor Attunity is expanding its role as a real-time data integrator for enterprise IT shops to build a road for customers between big data and the cloud. In the process, it is highlighting a series of infrastructure services designed specifically to support cloud-bound big data.
A new feature in the company's Attunity Cloudbeam SaaS service is intended to give IT managers a simpler method than FTP for copying large data sets from the data center to the cloud. In this case, the destination is Amazon Web Services' S3 storage.
Attunity Cloudbeam first copies and synchronizes large data sets within a corporation to storage space within Amazon's S3. Once the data is inside the cloud, Attunity's software replicates it to other servers within the Amazon network in order to create secure backups that are physically separated from the master copy and to improve performance by locating data closer to end users. Within an end-user company, Attunity's service uses a range of data management, replication, and integration software to assemble coherent sets of big data.
To move data to the cloud, Attunity relies on the managed file-transfer software from RepliWeb, which Attunity acquired last fall.
Managed file transfer (MFT) is designed to provide secure, verifiable transfers of large volumes of data. While it often relies on the FTP protocol itself to move files, MFT adds performance metrics, accelerators, security, auditing, and real-time reporting on the progress of a data transfer. Basic FTP offers only the ability to transfer files; it does not secure them in motion or audit the security and transport of a transfer.
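To make the difference concrete, here is a minimal sketch in Python of what "verifiable" and "audited" can mean for a single transfer. It is not Attunity's or RepliWeb's implementation, and it copies between local paths rather than to S3. The function names `sha256_of` and `managed_copy`, the JSON audit-log format, and the `on_progress` callback are all assumptions made for illustration.

```python
import hashlib
import json
import time
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def managed_copy(source, destination, audit_log, chunk_size=1 << 20, on_progress=None):
    """Copy a file in chunks, report progress, verify it, and log the result.

    Hypothetical helper: a local stand-in for the bookkeeping an MFT
    product layers on top of a plain file transfer.
    """
    source, destination = Path(source), Path(destination)
    total = source.stat().st_size
    sent = 0
    started = time.time()

    with open(source, "rb") as src, open(destination, "wb") as dst:
        for chunk in iter(lambda: src.read(chunk_size), b""):
            dst.write(chunk)
            sent += len(chunk)
            if on_progress is not None:
                on_progress(sent, total)  # progress reporting during the transfer

    # Verification: the destination must hash to the same value as the source.
    source_hash = sha256_of(source, chunk_size)
    destination_hash = sha256_of(destination, chunk_size)

    record = {
        "source": str(source),
        "destination": str(destination),
        "bytes": sent,
        "seconds": round(time.time() - started, 3),
        "sha256": source_hash,
        "verified": source_hash == destination_hash,
    }

    # Auditing: append one JSON line per transfer to the audit log.
    with open(audit_log, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")

    return record
```

A caller might pass `on_progress=lambda sent, total: print(sent, "of", total, "bytes")` and check the returned record's `verified` field before treating the copy as complete. A bare FTP `put` provides none of these three pieces (progress, verification, audit trail) on its own.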
According to an April report from Gartner, many large companies are moving to replace their FTP-based file transfer routines with MFT or other managed approaches. A similar report published by Forrester last July made the case that MFT is a better fit with financial regulatory and compliance rules than FTP because, with MFT, files pass through a server that controls, monitors, and reports on the process rather than simply moving from client to server.
Special, higher-tech approaches to file management are necessary in the era of cloud and big data because the volumes of data to be moved are greater than ever, and the data transferred into the cloud is expected to remain not only secure but under the control of IT, according to Forrester's report. Using a cloud-based service as a platform for big data transfer and replication makes sense not only because of the volume of data involved, but also because it allows for rapid changes in both volume and processing for real-time or on-the-fly analysis, and it can reduce the capital cost of big-data projects, according to Shmuel Kliger, CTO of virtualization management vendor VMTurbo.
Tools that monitor and manage virtual infrastructures are creating problems of their own by collecting reams of data without having first created the tools to manage them, Kliger said. As a result, some virtualization vendors are investing in big data systems not to analyze customer data, but to sift through otherwise unmanageable piles of performance data.
Moving data to the cloud doesn't address that issue, but adding a data gateway with an intelligent managed file transfer function does provide a single source to gather, store, and eventually analyze the performance and security of data pushed into the cloud.
Without a centralized monitoring function, it is almost impossible to keep tabs on the progress of individual uploads, let alone the constant stream of new transactions, website customer activity records, analytic updates, metadata, or other information a big data project generates, according to Shalini Das, research director of the CIO Executive Board consultancy in Washington, D.C., who conducted numerous surveys and interviews with CIOs. "Transaction data can be very high-volume, but a lot of companies use [instant messaging], Tweets, and other collaborative platforms, so the unstructured data can be coming from many sources and be updated constantly," she said.
The cloud is not strictly necessary for storage or analysis of big data, but it does make access to both data and analysis tools much simpler for both IT and end users, according to Das.
Many companies use cloud services like Amazon's only to host their own applications, or to store data so that it's available to others in the company in secure, highly reliable data centers. "IT can no longer do process automation and think it's keeping up," Das said. "Every process that can be automated has already been automated. What IT needs to do is provide access to information, which means the data and the analytics to understand it. The more accessible information is to employees, the more useful it is and the more benefit the company can get out of it."