Multi-structured Redundancy

One-size-fits-all solutions have not worked well in storage systems. This is true in the enterprise, where NoSQL, MapReduce and column stores have added value to traditional database workloads. It is also true outside the enterprise: a recent paper [7] illustrated that even the single-desktop store is a rich mixture of file systems, databases and key-value stores. Yet in research, one-size-fits-all solutions remain tempting and point optimizations keep emerging, with the current theme du jour being key-value stores [8]. Moreover, workloads naturally change their requirements over time (e.g., from update-intensive to query-intensive), so a structure that suits one phase may not suit the next. This paper proposes research around a multi-structured storage architecture. Such an architecture is composed of many lightweight data structures such as B-trees, key-value stores, graph stores and chunk stores. The call for modular storage and systems is not dissimilar to the Exokernel [4] or Anvil [10] approaches. The key difference this paper argues for is that these data structures should co-exist in the same system, which should then automatically use the right one at the right workload phase. To enable this technically, we propose to leverage the existing N-way redundancy in the data center and have each of the N replicas embody a different data structure.
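
As a rough illustration of the idea, the following minimal sketch keeps N = 2 in-memory replicas of the same data, one laid out as a hash index and one as a sorted array (a simple stand-in for a B-tree). Every write goes to both replicas, and each read is routed to the replica whose structure suits it. All class and method names here (HashReplica, SortedReplica, MultiStructuredStore, put, get, scan) are hypothetical and chosen for illustration; the routing rule is an assumption and not a design taken from this paper.

```python
import bisect


class HashReplica:
    """Replica laid out as a hash index: cheap point lookups."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def scan(self, lo, hi):
        # No key ordering is maintained, so a range query
        # must filter and sort every entry.
        return sorted((k, v) for k, v in self._data.items() if lo <= k <= hi)


class SortedReplica:
    """Replica laid out as a sorted array: cheap range scans."""

    def __init__(self):
        self._keys = []
        self._vals = []

    def put(self, key, value):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._vals[i] = value  # overwrite an existing key
        else:
            self._keys.insert(i, key)
            self._vals.insert(i, value)

    def get(self, key):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._vals[i]
        return None

    def scan(self, lo, hi):
        # Binary search for both ends of the inclusive range [lo, hi].
        start = bisect.bisect_left(self._keys, lo)
        end = bisect.bisect_right(self._keys, hi)
        return list(zip(self._keys[start:end], self._vals[start:end]))


class MultiStructuredStore:
    """Hypothetical front end: same data, differently structured replicas."""

    def __init__(self):
        self.point_replica = HashReplica()
        self.range_replica = SortedReplica()
        self.replicas = [self.point_replica, self.range_replica]

    def put(self, key, value):
        # Redundancy: every replica receives every write.
        for replica in self.replicas:
            replica.put(key, value)

    def get(self, key):
        # Assumed routing rule: point lookups go to the hash replica.
        return self.point_replica.get(key)

    def scan(self, lo, hi):
        # Assumed routing rule: range queries go to the sorted replica.
        return self.range_replica.scan(lo, hi)


store = MultiStructuredStore()
for k, v in [("b", 2), ("a", 1), ("c", 3)]:
    store.put(k, v)
point_result = store.get("b")
range_result = store.scan("a", "b")
```

Because both replicas hold the same logical contents, either one can still answer any query (and can stand in for the other after a failure); the routing only decides which one answers it cheaply. A real system would additionally need to choose replicas based on observed workload phases, which this sketch does not attempt.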