The InformationWeek -- Blogs

Startup City Blog

Ambitious Startup Wants To Manage All Your Unstructured Data


Posted by Andrew Conry-Murray, Mar 2, 2009 03:40 PM

Digital Reef swings for the data management fences by indexing and classifying all unstructured data in the enterprise. Top applications include e-discovery and storage management.


Digital Reef, which is having its official company launch today, has no shortage of ambition. The company aims to index and auto-classify all the unstructured data floating around on file servers, backup systems, archives, e-mail, collaboration tools, and content management systems.

The goal is to make unstructured data easier to manage when it comes to e-discovery, storage management, and compliance.

Digital Reef's software crawls network storage systems and creates a full-content index of everything it finds, including metadata. It supports NFS and CIFS, so it can mount most file stores. The company also has prebuilt connectors to pull information from applications such as SharePoint, Exchange, and Lotus Notes.
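To make the crawl-and-index idea concrete, here is a minimal sketch of a full-content index over a mounted file store. It is not Digital Reef's code: the function name `build_index` and the choice to record only file size and modification time as metadata are assumptions made for illustration, and it reads every file as plain text.

```python
import os
import re
from collections import defaultdict

def build_index(root):
    """Walk a directory tree; return (word -> set of paths, path -> metadata)."""
    index = defaultdict(set)   # inverted index: term -> files containing it
    metadata = {}              # per-file metadata kept alongside the content index
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            stat = os.stat(path)
            metadata[path] = {"size": stat.st_size, "mtime": stat.st_mtime}
            # Assumption: files are text; undecodable bytes are skipped.
            with open(path, "r", encoding="utf-8", errors="ignore") as f:
                for word in re.findall(r"[a-z0-9]+", f.read().lower()):
                    index[word].add(path)
    return index, metadata
```

Pointing `build_index` at an NFS or CIFS mount point would index that share the same way as a local directory, since the operating system presents both as ordinary paths.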

The index is stored on a grid computing cluster. Customers can use commodity hardware for the grid. Content on the grid is stored in a flat-file format rather than in a database.

As the software indexes content, it also analyzes it with a similarity engine. This engine, which is the primary IP of the company, performs two major functions. First, it looks for duplicates or near-duplicates of files. By identifying duplicates, the index can return fewer files in an e-discovery exercise, saving time and money on document review.
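Digital Reef has not published how its similarity engine works, so the following is only a generic illustration of near-duplicate detection. It swaps in a common textbook technique, word shingling with Jaccard similarity; the function names, the shingle size of three words, and the 0.5 threshold are all assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_duplicates(docs, threshold=0.5):
    """Return pairs of document names whose shingle sets overlap at or above threshold."""
    names = sorted(docs)
    sets = {n: shingles(docs[n]) for n in names}
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(sets[a], sets[b]) >= threshold:
                pairs.append((a, b))
    return pairs
```

In a review workflow, each flagged pair could be collapsed to a single representative file, which is how duplicate detection shrinks the set a reviewer has to read. Comparing every pair is quadratic in the number of documents, so a system at enterprise scale would need something faster than this sketch.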

The second function is auto-classification. The software looks at every piece of data in a file and suggests a classification based on the most relevant semantic ideas being expressed in the file. According to the company, the software doesn't need to be trained before it classifies and categorizes content. "We label all our folders with the top terms that placed a document into the folder, so you can understand why a document is in that folder," says Brian Giuffrida, VP of marketing and development at Digital Reef.
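The idea of labeling a folder with the terms that put a document there can be illustrated with a simple stand-in. The sketch below ranks each document's words by TF-IDF, a standard weighting that favors terms frequent in one document but rare across the collection. This is an assumption for illustration only; the company's semantic engine is proprietary and may work quite differently, and the name `top_terms` is hypothetical.

```python
import math
import re
from collections import Counter

def top_terms(docs, n=3):
    """Return the n highest-scoring TF-IDF terms for each document."""
    tokenized = {name: re.findall(r"[a-z]+", text.lower())
                 for name, text in docs.items()}
    doc_freq = Counter()                 # number of documents containing each term
    for words in tokenized.values():
        doc_freq.update(set(words))
    total = len(docs)
    labels = {}
    for name, words in tokenized.items():
        counts = Counter(words)
        scores = {w: (c / len(words)) * math.log(total / doc_freq[w])
                  for w, c in counts.items()}
        # Highest score first; ties broken alphabetically for stable output.
        labels[name] = sorted(scores, key=lambda w: (-scores[w], w))[:n]
    return labels
```

No training step is involved here either: the scores come entirely from word counts in the collection being indexed. A word that appears in every document, such as "the", scores zero and never becomes a label.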

The index is fully searchable. While the company says general users can search the index, it's aimed more at legal and compliance managers who need to search large volumes of information for efforts such as e-discovery. They can also use it to find sensitive data, such as Social Security or credit card numbers, that may need to be moved to a more secure location.
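As a rough illustration of what a search for sensitive data involves, the sketch below scans text for strings shaped like U.S. Social Security numbers and for 13- to 19-digit runs that pass the Luhn checksum used by payment cards. The patterns and the function names are assumptions, not anything Digital Reef has described, and real detection would need more care to limit false positives.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, each optionally followed by a space or hyphen.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def luhn_ok(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:        # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_sensitive(text):
    """Return (kind, matched_text) pairs for likely SSNs and card numbers."""
    hits = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_ok(digits):
            hits.append(("card", m.group()))
    return hits
```

The checksum step matters: a 16-digit order number looks like a card number to the pattern alone, but most such strings fail the Luhn test and are dropped.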

The company claims it can index and classify up to 4 TB worth of files every 24 hours. Management software routes jobs among servers in the grid to balance loads. If an indexing job on a particular target fails, the software can restart the job from the failure point instead of having to re-index the entire file server.
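Restarting from the failure point generally means recording progress as the job runs. A minimal sketch of that pattern follows, assuming a JSON checkpoint file and a caller-supplied `process` function; both are hypothetical, and the article does not say how Digital Reef tracks job state.

```python
import json
import os

def index_with_checkpoint(paths, process, checkpoint_file):
    """Process paths in order, saving progress so a rerun resumes after the last success."""
    done = 0
    if os.path.exists(checkpoint_file):
        with open(checkpoint_file) as f:
            done = json.load(f)["done"]      # items completed by earlier runs
    for i in range(done, len(paths)):
        process(paths[i])                    # may raise; earlier progress is already saved
        with open(checkpoint_file, "w") as f:
            json.dump({"done": i + 1}, f)
    return len(paths) - done                 # items processed in this run
```

If `process` raises partway through, calling `index_with_checkpoint` again with the same arguments skips the items already recorded and picks up at the one that failed.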

E-discovery is the killer app for Digital Reef, but the company says it also has plans to add features that will let administrators move data from one storage tier to another, set retention policies, or delete files at the end of the retention life cycle.

Digital Reef faces a host of competition, including Autonomy and Recommind. Vendors such as Guidance Software, Kazeon, StoredIQ, and ZyLAB are also competing to be the indexing software of choice for e-discovery efforts.
