Hastening Write Operations on Read-Optimized Out-of-Core Column-Store Databases Utilizing Timestamped Binary Association Tables

This thesis extends previous research on Out-of-Core column-store databases. Under the Asynchronous Out-of-Core update, which tracks data using timestamps, updates are appended to the tables as new tuples, creating an appendix that holds the newest timestamps and the updated data. The appendix is unsorted and unindexed, so it must be searched linearly, which is slow and makes query time grow as the volume of data in the appendix expands. Although measures exist to merge the appendix with the original body of the data, which is sorted and indexed, searches become faster only once the merge is complete. For this reason, this thesis proposes an offset B-Tree index to allow more efficient searches on the appendix.
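The contrast between a linear scan of the appendix and an offset-based index can be illustrated with a minimal sketch. It is not the thesis's implementation: the class name `Appendix`, its methods, and the `(key, timestamp, value)` tuple layout are assumptions made for illustration, and a sorted list searched with `bisect` stands in for the B-Tree, which the Python standard library does not provide. Keys are assumed to be integers.

```python
import bisect


class Appendix:
    """Append-only store of (key, timestamp, value) tuples (hypothetical layout)."""

    def __init__(self):
        self.rows = []    # tuples in arrival order: unsorted, unindexed
        self.index = []   # sorted (key, offset) pairs; a stand-in for a B-Tree

    def append(self, key, timestamp, value):
        """Append an update as a new tuple and record its offset in the index."""
        offset = len(self.rows)
        self.rows.append((key, timestamp, value))
        bisect.insort(self.index, (key, offset))
        return offset

    def linear_lookup(self, key):
        """Scan every tuple and return the value with the newest timestamp."""
        newest = None
        for k, ts, value in self.rows:
            if k == key and (newest is None or ts > newest[0]):
                newest = (ts, value)
        return None if newest is None else newest[1]

    def indexed_lookup(self, key):
        """Binary-search the index, then read only the matching offsets."""
        # Offsets are never negative, so (key, -1) sorts before every entry for key.
        i = bisect.bisect_left(self.index, (key, -1))
        newest = None
        while i < len(self.index) and self.index[i][0] == key:
            offset = self.index[i][1]
            _, ts, value = self.rows[offset]
            if newest is None or ts > newest[0]:
                newest = (ts, value)
            i += 1
        return None if newest is None else newest[1]


appendix = Appendix()
appendix.append(7, 1, "x")
appendix.append(3, 2, "y")
appendix.append(7, 5, "z")
latest = appendix.indexed_lookup(7)  # touches only the two tuples with key 7
```

In this sketch the appendix itself stays in arrival order, so appends remain cheap on the data side. Only the index is kept sorted, and a lookup reads just the offsets that match the key instead of every tuple.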
