A positional access method for relational databases

Most commercial database management systems sort the tuples of a relation by primary key to support efficient insertions, deletions, and updates. However, primary keys are usually auto-generated integers, which carry little useful information about user data, so secondary indexes must sometimes be created to retrieve tuples by columns other than the primary key. A better solution is to sort the data by the columns that appear frequently in retrieval conditions. Unfortunately, this approach does not work, at least not directly, when the relation is vertically partitioned, a popular technique for reducing I/O overhead. It is difficult to keep the tuples of two partitions in exactly the same order unless the sorting columns are replicated in both, and that replication wastes storage space and disk bandwidth. In this paper, we introduce a positional access method that allows one partition to be sorted in the order of another while incurring little storage overhead, and we provide details on how to improve its performance.
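The alignment problem described above can be illustrated with a small sketch. It is not the access method introduced in the paper. It only shows why sorting one vertical partition breaks positional tuple reconstruction, and how a naive repair would carry the sort order over as a permutation of positions. The relation, the partitions `p1` and `p2`, and the helper `reconstruct` are all hypothetical names chosen for illustration.

```python
# Hypothetical relation R(id, name, city), vertically partitioned into
# P1(id, name) and P2(city). Tuples are matched purely by position.
p1 = [(1, "carol"), (2, "alice"), (3, "bob")]
p2 = ["Oslo", "Lima", "Kyiv"]


def reconstruct(left, right):
    """Rebuild full tuples by pairing entries at the same position."""
    return [(i, name, city) for (i, name), city in zip(left, right)]


original = reconstruct(p1, p2)

# Sorting only P1 by name breaks the positional correspondence with P2.
p1_sorted = sorted(p1, key=lambda t: t[1])
broken = reconstruct(p1_sorted, p2)

# Naive repair (an assumption for illustration, not the paper's method):
# compute the sort order as a permutation of positions and apply it to
# both partitions, so P2 follows P1's order without storing `name` twice.
order = sorted(range(len(p1)), key=lambda i: p1[i][1])
p1_by_name = [p1[i] for i in order]
p2_by_name = [p2[i] for i in order]
aligned = reconstruct(p1_by_name, p2_by_name)
```

In this sketch the permutation has to be recomputed or maintained on every insertion and deletion, which is the kind of cost a dedicated positional access method would aim to reduce.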
