Association Rules Discovery via Approximate Method from Probabilistic Database

Discovering association rules and frequent patterns is a long-standing topic in the database community. Because real data is often affected by noise, this paper studies how to find frequent patterns and generate association rules over a probabilistic database under the Possible World Semantics. This is technically challenging, since a probabilistic database can have an exponential number of possible worlds. Although several efficient algorithms have been proposed in the literature, there is still considerable room for improvement because frequent patterns over probabilistic data are highly redundant. To address this issue, we adopt an approximation approach and propose a more efficient algorithm for mining frequent patterns. We then present two distinct strategies for obtaining association rules and design an evaluation model to measure the accuracy of the resulting rules. Finally, extensive experiments on real databases demonstrate that the proposed method performs better than state-of-the-art methods in most cases.
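To make the cost of the Possible World Semantics concrete, the following minimal sketch computes the probability that an itemset is frequent in two ways: by enumerating all 2^n possible worlds, and by a Normal approximation of the support distribution. Everything here is an assumption made for illustration and is not taken from the paper: the toy database `DB`, the tuple-uncertainty model (each transaction exists independently with a given probability), the function names, and the choice of a Normal approximation, which is one common approximation technique and not necessarily the one the authors propose.

```python
from itertools import product
from math import erf, sqrt

# Hypothetical toy database: each transaction is (items, existence probability).
DB = [
    ({"a", "b"}, 0.9),
    ({"a"}, 0.6),
    ({"a", "b", "c"}, 0.7),
    ({"b", "c"}, 0.5),
]


def frequent_prob_exact(db, itemset, minsup):
    """Sum the probability of every possible world where support >= minsup.

    Enumerates 2**len(db) worlds, so it is only feasible for tiny inputs.
    """
    total = 0.0
    for world in product([0, 1], repeat=len(db)):
        prob = 1.0
        support = 0
        for present, (items, p) in zip(world, db):
            prob *= p if present else 1.0 - p
            if present and itemset <= items:
                support += 1
        if support >= minsup:
            total += prob
    return total


def frequent_prob_normal(db, itemset, minsup):
    """Approximate Pr[support >= minsup] with a Normal distribution.

    The support is a sum of independent Bernoulli variables, so its mean and
    variance are available in linear time. A continuity correction of 0.5 is
    applied to the threshold.
    """
    probs = [p for items, p in db if itemset <= items]
    mean = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)
    if var == 0.0:
        return 1.0 if mean >= minsup else 0.0
    z = (minsup - 0.5 - mean) / sqrt(var)
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))


if __name__ == "__main__":
    exact = frequent_prob_exact(DB, {"a"}, minsup=2)
    approx = frequent_prob_normal(DB, {"a"}, minsup=2)
    print(f"exact={exact:.4f} approx={approx:.4f}")
```

The exact routine doubles its work with every additional transaction, while the approximation needs only one pass over the transactions that contain the itemset. That gap is the motivation for approximate mining, at the price of a small error in the estimated probability.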
