Investigating Storage Layouts to Improve Parallelism in Interactive Brute-Force Search
Overview: This research paper describes different storage layout approaches and their effect on the efficiency of data storage and retrieval in Diamond, an interactive search system. Diamond uses early discard to interactively search terabytes of unindexed data such as images and videos. Four storage layouts were compared: the existing layout, which simply places whole objects on each node; an archive approach, which stores a tar archive of data objects on each node; and two striping approaches, which differ in how they align data striped across nodes, one of them repeating the incomplete data in the next stripe.
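
As an illustration only, the sketch below shows one way a striping layout that repeats incomplete data in the next stripe could be computed. It is not Diamond's actual implementation: the function name, the fixed stripe size, the round-robin assignment of stripes to nodes, and the rule that every object fits within one stripe are all assumptions made for this example.

```python
from typing import List, Tuple

# (object_id, stripe_index, node_index, offset_in_stripe) of the complete copy
Placement = Tuple[int, int, int, int]


def layout_with_repeat(
    sizes: List[int], stripe_size: int, num_nodes: int
) -> Tuple[List[Placement], int]:
    """Hypothetical layout: pack objects into fixed-size stripes.

    Stripes are assigned to nodes round-robin (an assumption). When an
    object does not fit in the space left in the current stripe, that
    space holds only an incomplete fragment, so the object is repeated
    in full at the start of the next stripe. Returns the placement of
    each complete copy and the bytes spent on incomplete fragments.
    """
    placements: List[Placement] = []
    stripe, offset, fragment_bytes = 0, 0, 0
    for obj_id, size in enumerate(sizes):
        if size > stripe_size:
            raise ValueError("this sketch assumes each object fits in one stripe")
        if offset + size > stripe_size:
            # The tail of this stripe holds an incomplete fragment;
            # the whole object is stored again in the next stripe.
            fragment_bytes += stripe_size - offset
            stripe += 1
            offset = 0
        placements.append((obj_id, stripe, stripe % num_nodes, offset))
        offset += size
    return placements, fragment_bytes


placements, fragment_bytes = layout_with_repeat([40, 50, 30, 60], 100, 2)
```

Under these assumptions, every node can read each of its objects from a single stripe without fetching data from a neighbouring node, at the cost of the space taken by the repeated fragments.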

