Discovering shared conceptualizations in folksonomies

Social bookmarking tools are rapidly emerging on the Web. In such systems, users set up lightweight conceptual structures called folksonomies. Unlike in ontologies, the shared conceptualizations in folksonomies are not formalized but remain implicit. To discover these implicit shared conceptualizations, we present a new data mining task, the mining of all frequent tri-concepts. Our approach extends the data mining task of discovering all closed itemsets to three-dimensional data structures, which allows for mining folksonomies. We provide a formal definition of the problem and present an efficient algorithm for its solution. Finally, we show the applicability of our approach on three large real-world examples.
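To make the notion of a frequent tri-concept more concrete, the following is a minimal brute-force sketch in Python. It is not the efficient algorithm the abstract refers to. It assumes a folksonomy is given as a set of (user, tag, resource) triples, treats a tri-concept as a triple of sets (users, tags, resources) whose full Cartesian product lies in that set and which cannot be enlarged in any component, and calls it frequent when each component reaches an absolute minimum size. The function names, the thresholds, and the toy data are all hypothetical, and tri-concepts with an empty component are ignored.

```python
from itertools import chain, combinations


def nonempty_subsets(items):
    """Yield every non-empty subset of items as a tuple."""
    items = sorted(items)
    return chain.from_iterable(
        combinations(items, k) for k in range(1, len(items) + 1)
    )


def is_box(triples, users, tags, resources):
    """True if every (user, tag, resource) combination is a tag assignment."""
    return all(
        (u, t, r) in triples for u in users for t in tags for r in resources
    )


def frequent_tri_concepts(triples, min_users=1, min_tags=1, min_resources=1):
    """Brute-force enumeration; exponential, only suitable for toy data."""
    all_users = {u for u, _, _ in triples}
    all_tags = {t for _, t, _ in triples}
    all_resources = {r for _, _, r in triples}

    # Every non-empty "box" users x tags x resources fully contained in the data.
    boxes = [
        (frozenset(a), frozenset(b), frozenset(c))
        for a in nonempty_subsets(all_users)
        for b in nonempty_subsets(all_tags)
        for c in nonempty_subsets(all_resources)
        if is_box(triples, a, b, c)
    ]

    def strictly_contained(x, y):
        return x != y and all(xi <= yi for xi, yi in zip(x, y))

    # A tri-concept is a box that no other box contains component-wise.
    maximal = [
        x for x in boxes if not any(strictly_contained(x, y) for y in boxes)
    ]

    # Keep only those meeting the (assumed, absolute) size thresholds.
    return [
        (a, b, c)
        for a, b, c in maximal
        if len(a) >= min_users and len(b) >= min_tags and len(c) >= min_resources
    ]


# Hypothetical toy folksonomy.
Y = {
    ("alice", "python", "r1"), ("alice", "python", "r2"),
    ("alice", "code", "r1"), ("alice", "code", "r2"),
    ("bob", "python", "r1"), ("bob", "python", "r2"),
    ("bob", "code", "r1"), ("bob", "code", "r2"),
    ("alice", "python", "r3"),
    ("carol", "music", "r3"),
}

all_concepts = frequent_tri_concepts(Y)
shared = frequent_tri_concepts(Y, min_users=2, min_tags=2, min_resources=2)
```

With all thresholds set to 2, only the triple in which alice and bob both assign the tags python and code to r1 and r2 remains, which is the kind of implicitly shared conceptualization the task is meant to surface. The enumeration over all subsets is exponential and serves only to illustrate the definition on very small inputs.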
