Towards a Finer Assessment of Extraction Contexts Sparseness

It is widely recognized that the performance of frequent closed itemset mining algorithms depends closely on the type of extraction context handled, i.e., sparse or dense. In this paper, we address an important question: how can we formally define the sparseness of a given extraction context and assess its value? To answer it, we study the problem of assessing the sparseness of an extraction context. Using the framework of the succinct system of minimal generators, we present a new sparseness measure that aggregates two complementary measures, namely the succinctness and the compactness of each equivalence class induced by the closure operator. Preliminary experiments allow us to rectify the classification of benchmark contexts and support our claim that the "dense" and "sparse" qualifications are not absolute.
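To make the notions of closure operator, equivalence class, and minimal generator concrete, the following minimal Python sketch enumerates them on a toy extraction context. The context data and all function names (`support_set`, `closure`, `equivalence_classes`, `minimal_generators`) are hypothetical and chosen only for illustration. The sketch uses the standard Galois closure from formal concept analysis and brute-force enumeration; it does not reproduce the succinctness, compactness, or sparseness measures of the paper, whose definitions are not given in this abstract.

```python
from itertools import combinations

# Toy extraction context (hypothetical data): object id -> set of items.
CONTEXT = {
    1: {"a", "b", "c"},
    2: {"a", "b", "c"},
    3: {"a", "b"},
    4: {"c"},
}
ITEMS = sorted(set().union(*CONTEXT.values()))


def support_set(itemset):
    """Objects of the context that contain every item of `itemset`."""
    return frozenset(o for o, items in CONTEXT.items() if itemset <= items)


def closure(itemset):
    """Galois closure: items shared by all objects supporting `itemset`."""
    objs = support_set(itemset)
    if not objs:
        # Convention assumed here: an unsupported itemset closes to all items.
        return frozenset(ITEMS)
    return frozenset(set.intersection(*(CONTEXT[o] for o in objs)))


def equivalence_classes():
    """Group every itemset by its closure (exponential; toy sizes only)."""
    classes = {}
    for k in range(len(ITEMS) + 1):
        for combo in combinations(ITEMS, k):
            itemset = frozenset(combo)
            classes.setdefault(closure(itemset), []).append(itemset)
    return classes


def minimal_generators(members):
    """Members of a class that have no proper subset in the same class."""
    return [g for g in members if not any(h < g for h in members)]
```

On this toy context, the itemsets {a}, {b}, and {a, b} share the closure {a, b} and thus form one equivalence class whose minimal generators are {a} and {b}. Intuitively, a context in which most classes contain a single itemset behaves like a sparse one, whereas large classes with several minimal generators are typical of dense ones; how the paper quantifies this is left to its full text.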
