Highly Scalable Multiprocessing Algorithms for Preference-Based Database Retrieval

Until recently, algorithms gained free performance improvements from ever-increasing processor speeds. This development has reached its limit: new generations of CPUs increase the number of processing cores rather than the performance of a single core. Sequential algorithms will therefore not benefit from future hardware advances, and highly scalable parallel algorithms are needed to fully exploit the potential of new hardware. In this paper, we establish a design space for parallel algorithms in the domain of personalized database retrieval, taking skyline algorithms as a representative example. We investigate the spectrum of base operations of different retrieval algorithms and various parallelization techniques to develop a set of highly scalable, high-performing skyline algorithms for different retrieval scenarios. Finally, we extensively evaluate these algorithms to demonstrate their scalability and performance.
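To make the skyline operation concrete, the following is a minimal sketch, not one of the paper's algorithms. It assumes that smaller values are preferred in every dimension and uses a simple partition-and-merge scheme: each partition computes a local skyline, and the local results are merged by one more skyline pass. The names `dominates`, `skyline`, and `parallel_skyline` are hypothetical and chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are better in every dimension.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Compute the skyline with a simple window of non-dominated candidates."""
    window: List[Point] = []
    for p in points:
        # Skip p if a current candidate already dominates it.
        if any(dominates(w, p) for w in window):
            continue
        # Otherwise drop every candidate that p dominates, then keep p.
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


def parallel_skyline(points: Sequence[Point], workers: int = 4) -> List[Point]:
    """Partition the input, compute local skylines concurrently, then merge."""
    # Round-robin partitioning; some chunks may be empty for small inputs.
    chunks = [points[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        local_results = list(pool.map(skyline, chunks))
    # A global skyline point is never dominated, so it survives in its
    # partition; one final pass over the union removes the rest.
    merged = [p for part in local_results for p in part]
    return skyline(merged)


if __name__ == "__main__":
    # Hypothetical data: (price, distance) pairs, both to be minimized.
    hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 2), (55, 9)]
    print(sorted(parallel_skyline(hotels, workers=2)))
```

The thread pool only illustrates the structure. In CPython, threads do not run this CPU-bound work on several cores at once, so an actual speedup would need processes or a language with shared-memory parallelism.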
