Commentary
3/2/2009
03:40 PM

Ambitious Startup Wants To Manage All Your Unstructured Data

Digital Reef swings for the data management fences by indexing and classifying all unstructured data in the enterprise. Top applications include e-discovery and storage management.

Digital Reef, which is having its official company launch today, has no shortage of ambition. The company aims to index and auto-classify all the unstructured data floating around on file servers, backup systems, archives, e-mail, collaboration tools, and content management systems.

The goal is to make unstructured data easier to manage when it comes to e-discovery, storage management, and compliance.

The company's approach is comprehensive: it deploys software that crawls network storage systems and creates a full-content index of everything it finds, including metadata. The software supports NFS and CIFS, so it can mount most file stores. The company also has prebuilt connectors to pull information from applications such as SharePoint, Exchange, and Lotus Notes.

The index is stored on a grid computing cluster. Customers can use commodity hardware for the grid. Content on the grid is stored in a flat-file format rather than in a database.

As the software indexes content, it also analyzes it with a similarity engine. This engine, which is the primary IP of the company, performs two major functions. First, it looks for duplicates or near-duplicates of files. By identifying duplicates, the index can return fewer files in an e-discovery exercise, saving time and money on document review.
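Digital Reef has not published how its similarity engine works. As an illustration only, one common way to flag near-duplicates is to compare overlapping word sequences ("shingles") with Jaccard similarity. The function names, the shingle size, and the 0.8 threshold in this sketch are all assumptions, not details from the company.

```python
from typing import Dict, List, Set, Tuple


def shingles(text: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of k-word sequences in the text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: Set[Tuple[str, ...]], b: Set[Tuple[str, ...]]) -> float:
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs: Dict[str, str], threshold: float = 0.8) -> List[Tuple[str, str]]:
    """Return pairs of document names whose similarity meets the threshold."""
    names = sorted(docs)
    sets = {name: shingles(docs[name]) for name in names}
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if jaccard(sets[first], sets[second]) >= threshold:
                pairs.append((first, second))
    return pairs
```

Comparing every pair is fine for a handful of files but would not scale to an enterprise file store; a production system would need a hashing or indexing scheme to avoid the all-pairs comparison.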

The second function is auto-classification. The software looks at every piece of data in a file and suggests a classification based on the most relevant semantic ideas being expressed in the file. According to the company, the software doesn't need to be trained before it classifies and categorizes content. "We label all our folders with the top terms that placed a document into the folder, so you can understand why a document is in that folder," says Brian Giuffrida, VP of marketing and development at Digital Reef.
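The idea of labeling a folder with the terms that best explain its contents can be sketched with TF-IDF scoring, a standard unsupervised technique. This is a stand-in for illustration; the article does not say which method Digital Reef uses, and the function name and whitespace tokenizer are assumptions.

```python
import math
from collections import Counter
from typing import List


def top_terms(folder_docs: List[str], all_docs: List[str], n: int = 3) -> List[str]:
    """Rank a folder's terms by TF-IDF against the whole corpus.

    Assumes every document in folder_docs also appears in all_docs.
    """
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in all_docs:
        doc_freq.update(set(doc.lower().split()))

    # Term frequency within the folder.
    term_freq = Counter()
    for doc in folder_docs:
        term_freq.update(doc.lower().split())

    total = len(all_docs)
    scores = {
        term: count * math.log(total / doc_freq[term])
        for term, count in term_freq.items()
    }
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:n]]
```

Terms that are frequent inside the folder but rare elsewhere score highest, which is what makes them useful as an explanation of why a document landed there.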

The index is fully searchable. While the company says general users can search the index, it's aimed more at legal and compliance managers who need to search large volumes of information for efforts such as e-discovery, or to find sensitive data, such as Social Security or credit card numbers, that may need to be moved to a more secure location.
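To give a sense of what a search for sensitive data involves, here is a minimal sketch that scans text for Social Security number patterns and for card-like digit runs that pass the Luhn checksum. The patterns and function names are assumptions for illustration, not Digital Reef's implementation, and real detection would need more context checks to limit false positives.

```python
import re
from typing import List, Tuple

# U.S. Social Security numbers written as 123-45-6789.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text: str) -> List[Tuple[str, str]]:
    """Return (kind, value) hits: SSN patterns first, then valid card numbers."""
    hits = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_ok(digits):
            hits.append(("card", digits))
    return hits
```

The checksum step matters because long digit runs such as order numbers are common in business documents, and most of them fail Luhn.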

The company claims it can index and classify up to 4 TB worth of files every 24 hours. Management software routes jobs among servers in the grid to balance loads. If an indexing job on a particular target fails, the software can restart the job from the failure point instead of having to re-index the entire file server.
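Restarting from the failure point is typically done with a checkpoint. The sketch below is one simple way to do it, assuming a hypothetical `index_one` callback and a JSON checkpoint file; the article does not describe how Digital Reef tracks job progress.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def run_job(paths: Iterable[str], index_one: Callable[[str], None], checkpoint: Path) -> int:
    """Index paths in sorted order, resuming after the last completed one.

    Returns the number of files indexed during this run.
    """
    done = 0
    if checkpoint.exists():
        done = json.loads(checkpoint.read_text())["done"]

    ordered = sorted(paths)
    for i in range(done, len(ordered)):
        index_one(ordered[i])  # may raise; checkpoint then still points at this file
        checkpoint.write_text(json.dumps({"done": i + 1}))
    return len(ordered) - done
```

Because the checkpoint is written only after a file is indexed successfully, a rerun picks up at the file that failed. The sketch assumes the file list is the same on both runs; a target that changes between runs would need the checkpoint to record paths rather than a count.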

E-discovery is the killer app for Digital Reef, but the company says it also has plans to add features that will let administrators move data from one storage tier to another, set retention policies, or delete files at the end of the retention life cycle.

Digital Reef faces a host of competitors, including Autonomy and Recommind. Vendors such as Guidance Software, Kazeon, StoredIQ, and ZyLAB are also competing to be the indexing software of choice for e-discovery efforts.
