Formal context coverage based on isolated labels: An efficient solution for text feature extraction

Many kinds of available data, such as images, texts, or databases, may be mapped into an equivalent or approximate binary relation. A text may be considered as a binary relation between sentences and words, while a numerical table may be represented by a binary relation after applying a scaling approach. A social network may also be represented by a formal context. The objective of this paper is to present an original approach for covering a binary relation by formal concepts based on isolated single or multiple properties, i.e., those belonging to only one concept. Isolated properties can be used efficiently to discriminate and label concepts. The labeled concepts are then used for browsing a corpus or a document by navigating through the associated labels. By using fringe relations, the presented approach, compared with those in the literature, has the advantage of offering a relevant feature of a context through significant labels. The experiments carried out show the benefits of the introduced approach.
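As an illustration of the ideas above, the following minimal Python sketch maps a toy text to a sentence-by-word binary relation, enumerates its formal concepts by brute force, and labels each concept with the words that occur in no other concept. It is not the paper's algorithm: it does not use fringe relations and computes no coverage. All function names are hypothetical. Reading "isolated" as "occurring in the intent of exactly one enumerated concept" is a simplifying assumption.

```python
from itertools import combinations


def build_context(sentences):
    """Binary relation: sentence index -> set of lowercase words in it."""
    return {i: set(s.lower().split()) for i, s in enumerate(sentences)}


def common_words(context, sentence_ids):
    """Words shared by all given sentences (all words if none are given)."""
    if not sentence_ids:
        return set().union(*context.values())
    return set.intersection(*(context[i] for i in sentence_ids))


def matching_sentences(context, words):
    """Sentences that contain every word in `words`."""
    return {i for i, ws in context.items() if words <= ws}


def formal_concepts(context):
    """Brute-force concept enumeration; exponential, for toy inputs only."""
    found = set()
    ids = list(context)
    for size in range(1, len(ids) + 1):
        for subset in combinations(ids, size):
            intent = common_words(context, subset)
            extent = matching_sentences(context, intent)
            if intent:  # skip concepts whose sentences share no word
                found.add((frozenset(extent), frozenset(intent)))
    return found


def isolated_labels(concepts):
    """Assumed reading: words found in the intent of exactly one concept."""
    labels = {}
    for concept in concepts:
        others = set()
        for other in concepts:
            if other != concept:
                others |= other[1]
        labels[concept] = concept[1] - others
    return labels


if __name__ == "__main__":
    text = ["cats chase mice", "dogs chase cats", "birds sing"]
    concepts = formal_concepts(build_context(text))
    for (extent, intent), label in isolated_labels(concepts).items():
        print(sorted(extent), sorted(intent), "label:", sorted(label))
```

In this toy text, the concept grouping the first two sentences shares all of its words with other concepts and therefore receives no isolated label under this reading, whereas each single-sentence concept is identified by at least one word of its own.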
