Top-k Queries Processing with Uncertain Data on Graphics Processing Units

Because uncertain databases are complex, top-k query processing over them is semantically and computationally different from classical top-k processing. Score is not the only factor to consider: the interplay between score and membership uncertainty complicates the computation. The computing power of a Graphics Processing Unit (GPU) is needed to answer this kind of query quickly. Using the GPU in batch mode, we present a CPU-GPU cooperative computing framework for processing top-k queries in uncertain databases. Two parallel GPU algorithms are designed specifically for this problem. Moreover, a "label-confidence" data format conversion is proposed to reduce CPU-GPU communication. We also suggest an error correction method, used with the heap-based algorithm, to improve the accuracy and correctness of the results. Experimental results show that the CPU-GPU framework delivers better performance and handles the uncertain top-k problem efficiently.
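
To illustrate why score alone does not decide the answer, the following is a minimal CPU-side sketch, not the GPU algorithms of this work. It assumes tuple-level uncertainty with independent tuples and distinct scores, and it ranks tuples by their probability of appearing in the top-k; that ranking semantics, the function name `topk_probabilities`, and the `(name, score, prob)` tuple layout are assumptions made for illustration only.

```python
import heapq

def topk_probabilities(tuples, k):
    """Return {name: probability that the tuple is among the k highest-scored
    tuples that actually exist}, assuming independent membership probabilities.

    tuples: iterable of (name, score, prob), with distinct scores (assumption).
    """
    ordered = sorted(tuples, key=lambda t: t[1], reverse=True)
    # dist[j] = probability that exactly j higher-scored tuples exist, for j < k.
    # Mass for j >= k is dropped because it can never contribute again.
    dist = [1.0] + [0.0] * (k - 1)
    result = {}
    for name, _score, prob in ordered:
        # The tuple is in the top-k if it exists and fewer than k
        # higher-scored tuples exist.
        result[name] = prob * sum(dist)
        # Fold this tuple into the count distribution for the tuples below it.
        updated = [0.0] * k
        for j in range(k):
            updated[j] = dist[j] * (1.0 - prob)
            if j > 0:
                updated[j] += dist[j - 1] * prob
        dist = updated
    return result

def most_probable_topk(tuples, k):
    """Pick the k tuples with the highest top-k probability using a heap."""
    probs = topk_probabilities(tuples, k)
    return heapq.nlargest(k, probs.items(), key=lambda item: item[1])

# Hypothetical data: "a" has the best score but rarely exists.
example = [("a", 90, 0.3), ("b", 80, 0.9), ("c", 70, 0.5)]
best = most_probable_topk(example, 1)
```

With this hypothetical data and k = 1, tuple "b" is selected rather than the highest-scored tuple "a", because "a" exists with probability 0.3 only, while "b" is the top tuple whenever it exists and "a" does not (0.9 × 0.7).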
