A Method of Mining Association Rules for Geographical Points of Interest

Association rule (AR) mining remains a challenging problem in data mining. Traditional algorithms generate a large number of candidate rules, and even after filtering with constraint measures such as support, confidence, and lift, many rules remain, so domain experts are still needed to pick out the rules of interest. This paper asks whether rule rankings can be provided directly and whether the proportional relationship between the items in a rule can be calculated. To address these two questions, it proposes a modified FP-Growth algorithm called FP-GCID (novel FP-Growth algorithm based on Cluster IDs) to generate ARs, together with a new method called Mean-Product of Probabilities (MPP) to rank the rules and compute the proportions of the items within a rule. The experiment is divided into three phases: the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is used to cluster the geographical points of interest and map the resulting clusters into corresponding transaction data; FP-GCID is used to generate ARs that carry cluster information; and MPP is used to choose the best rule based on the rankings. Finally, the rules are visualized to check whether the two questions stated above were answered.
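The three standard measures named above (support, confidence, and lift) can be illustrated with a minimal Python sketch. It is not the paper's FP-GCID or MPP procedure, whose details are not given here; the point-of-interest categories and the toy transactions are hypothetical and chosen only to show how the measures are computed.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]

# Hypothetical transactions: each set lists the POI categories found in one cluster.
TRANSACTIONS: List[Transaction] = [
    frozenset({"restaurant", "bank", "hotel"}),
    frozenset({"restaurant", "bank"}),
    frozenset({"restaurant", "school"}),
    frozenset({"bank", "hotel"}),
]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[Transaction]) -> float:
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    support_a = support(a, transactions)
    return support(both, transactions) / support_a if support_a else 0.0


def lift(antecedent: Iterable[str], consequent: Iterable[str],
         transactions: List[Transaction]) -> float:
    """Confidence divided by the support of the consequent; 1.0 means independence."""
    support_c = support(consequent, transactions)
    if not support_c:
        return 0.0
    return confidence(antecedent, consequent, transactions) / support_c


if __name__ == "__main__":
    rule = ({"restaurant"}, {"bank"})
    print("support:", support(rule[0] | rule[1], TRANSACTIONS))
    print("confidence:", confidence(*rule, TRANSACTIONS))
    print("lift:", lift(*rule, TRANSACTIONS))
```

Even on this toy data, every candidate rule receives three separate scores, and these measures alone do not provide a single ranking of the rules, which is the gap the ranking step is meant to close.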
