Ranking in Distributed Uncertain Database Environments

Distributed data processing is central to many modern applications, which collect and process data from distributed nodes to obtain global results. Large data transfers and network delay make centralized processing of such data difficult. A common way to address this problem is ranking queries: ranking, or top-k, queries return only the highest-ranked tuples according to the user's interest. Another issue in many modern applications is data uncertainty. Many techniques have been introduced for modeling, managing, and processing uncertain databases. Although these techniques are efficient, they do not address uncertainty in distributed data. This paper addresses both data uncertainty and data distribution through ranking queries. It proposes a novel framework for ranking distributed uncertain data, comprising a suite of novel algorithms for ranking data and monitoring updates. These algorithms reduce the number of communication rounds and the amount of data transmitted while achieving efficient and effective ranking. Experimental results show that the proposed framework considerably reduces communication cost compared to other techniques. DOI: http://dx.doi.org/10.11591/ijece.v4i4.5920
