Enhancing spatial association rule mining in geographic databases

Association rule mining algorithms generate a huge number of patterns. In spatial association rule mining the problem is worse, because many of the associations are well known a priori. This paper presents a novel approach for mining spatial association rules using background knowledge. The main contributions include the use of geographic database schemas and geo-ontologies for (i) the improvement of geographic data pre-processing, (ii) the elimination of well-known patterns, and (iii) the generation of maximal frequent patterns without redundant and non-interesting associations.
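
To make contributions (ii) and (iii) concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the frequent itemsets are enumerated by brute force, and the predicate names, the toy transactions, and the single "known dependency" (gas stations and streets) are hypothetical illustrations. It assumes that background knowledge is available as pairs of items whose association is already known, and it shows such pairs being filtered out before only the maximal frequent itemsets are kept.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of itemsets with support count >= min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                result[cset] = count
                found = True
        if not found:
            # No frequent itemset of this size, so none of a larger size either.
            break
    return result


def drop_known_dependencies(itemsets, known_pairs):
    """Remove every itemset that contains a pair known a priori to be associated."""
    return {s: c for s, c in itemsets.items()
            if not any(pair <= s for pair in known_pairs)}


def maximal(itemsets):
    """Keep only itemsets that have no proper superset in the collection."""
    return {s: c for s, c in itemsets.items()
            if not any(s < other for other in itemsets)}


# Hypothetical data: one transaction per reference object (e.g. a district),
# each item being a spatial predicate that holds for that object.
transactions = [
    frozenset({"contains_gasStation", "contains_street", "contains_slum"}),
    frozenset({"contains_gasStation", "contains_street", "crosses_river"}),
    frozenset({"contains_gasStation", "contains_street", "contains_slum", "crosses_river"}),
    frozenset({"contains_street", "contains_slum"}),
]

# Assumed background knowledge (hypothetical): gas stations are always on streets,
# so any pattern combining the two carries no new information.
known_pairs = [frozenset({"contains_gasStation", "contains_street"})]

frequent = frequent_itemsets(transactions, min_support=2)
result = maximal(drop_known_dependencies(frequent, known_pairs))

for itemset, support in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

In this toy run, filtering happens before the maximality check, so patterns that were previously hidden under a larger itemset containing the known pair can still surface as maximal.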
