Association Rule Mining on Remotely Sensed Images Using P-trees

Association rule mining, originally proposed for market basket data, has potential applications in many areas. Remotely Sensed Imagery (RSI) data is one of the most promising. Extracting interesting patterns and rules from datasets composed of images and associated ground data can be important in precision agriculture, community planning, resource discovery, and other areas. In most cases, however, the image data are too large to be mined in a reasonable amount of time using existing algorithms. In this paper, we propose an approach to deriving association rules from RSI data using the Peano Count Tree (P-tree) structure. The P-tree structure, proposed in our previous work, provides a lossless, compressed representation of image data. Based on P-trees, we introduce P-ARM, an efficient association rule mining algorithm that uses fast support calculation and significant pruning techniques to improve the efficiency of the rule mining process. The P-ARM algorithm is implemented and compared with the FP-growth and Apriori algorithms. Experimental results show that P-ARM is superior to both for association rule mining on RSI spatial data.
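To make the idea of P-tree-based support counting concrete, the following is a minimal sketch, not the P-ARM implementation. It assumes a square binary mask whose side is a power of two (for example, one bit plane or one value interval of one band), builds a quadrant count tree in which pure-0 and pure-1 quadrants are not subdivided, and obtains the support of a two-item itemset from the root count of the ANDed trees. The names `PNode`, `build`, `p_and`, and `support`, and the two example masks, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PNode:
    count: int                                # number of 1-bits in this quadrant
    size: int                                 # side length of the quadrant
    children: Optional[List["PNode"]] = None  # None for pure-0 or pure-1 quadrants


def build(bits, r=0, c=0, size=None):
    """Build a quadrant count tree over a 2^k x 2^k binary mask."""
    if size is None:
        size = len(bits)
    count = sum(bits[i][j] for i in range(r, r + size) for j in range(c, c + size))
    # Pure quadrants (all 0s or all 1s) become leaves; this is the compression.
    if count == 0 or count == size * size:
        return PNode(count, size)
    h = size // 2
    kids = [build(bits, r, c, h), build(bits, r, c + h, h),
            build(bits, r + h, c, h), build(bits, r + h, c + h, h)]
    return PNode(count, size, kids)


def p_and(a, b):
    """AND two count trees built over the same image extent."""
    if a.count == 0 or b.count == 0:   # a pure-0 operand gives a pure-0 result
        return PNode(0, a.size)
    if a.children is None:             # a is pure-1, so the result equals b
        return b
    if b.children is None:             # b is pure-1, so the result equals a
        return a
    kids = [p_and(x, y) for x, y in zip(a.children, b.children)]
    total = sum(k.count for k in kids)
    return PNode(0, a.size) if total == 0 else PNode(total, a.size, kids)


def support(tree):
    """Fraction of pixels whose bit is 1, read from the root count."""
    return tree.count / (tree.size * tree.size)


# Hypothetical 4x4 masks: pixels where band 1 / band 2 fall in a "high" interval.
band1_high = [[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 0, 1, 1]]
band2_high = [[1, 1, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 0]]

t1, t2 = build(band1_high), build(band2_high)
both = p_and(t1, t2)
print(both.count, support(both))   # 7 0.4375
```

In this sketch the support of the itemset {band 1 high, band 2 high} is obtained without scanning the pixels again: the AND skips any quadrant that is pure in either operand, and the root count of the result is the number of pixels containing both items.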
