Extracting User-Centric Knowledge on Two Different Spaces: Concepts and Records

The growing demand for eliciting useful knowledge from data calls for techniques that can discover the insights, in the form of patterns, that users actually need. Existing methodologies describe intrinsic and relevant properties of data by extracting useful patterns, but they work on fixed input data, so the data representation constrains the insights that can be discovered. This paper provides foundations for making the descriptive knowledge extracted by pattern mining more user-centric by relying on flexible data structures defined from two different perspectives: concepts and data records. Items in the data can be grouped into abstract terms through subjective hierarchies of concepts, and data records can likewise be organized according to the user's subjective perspective. A series of easy-to-follow toy examples is considered for each of the two perspectives to demonstrate the usefulness and necessity of the proposed foundations in pattern mining. Finally, to test experimentally whether classical pattern mining algorithms can be adapted to such flexible data structures, the experimental analysis covers different methodologies, including exhaustive search, random search, and evolutionary approaches. All of these approaches are based on well-known and widely recognized techniques, which shows that the provided foundations can support future research and the design of more efficient, purpose-built algorithms. The insights obtained demonstrate the importance of working with subjectivity: for example, an item can be a type of soda and, at the same time, belong to a pack that includes two or more soda types.
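As an illustration of the two perspectives described above, the following is a minimal sketch and not the authors' implementation. The hierarchy, the toy records, the customer keys, and the helper names (`generalize`, `group_records`, `support`) are all assumptions introduced here; in particular, merging records that share a key by set union is only one possible way to organize records.

```python
# Hypothetical user-defined concept hierarchy: item -> abstract concept.
HIERARCHY = {
    "cola": "soda",
    "lemon_soda": "soda",
    "orange_soda": "soda",
    "chips": "snack",
    "peanuts": "snack",
}

# Toy data records (assumed for illustration), one set of items per record.
RECORDS = [
    {"cola", "chips"},
    {"lemon_soda", "peanuts"},
    {"orange_soda", "chips"},
    {"cola", "bread"},
]

# Hypothetical key per record, e.g. the customer who produced it.
CUSTOMERS = ["ann", "ann", "bob", "bob"]


def generalize(record, hierarchy):
    """Concept perspective: replace each item by its concept, if it has one."""
    return {hierarchy.get(item, item) for item in record}


def group_records(records, keys):
    """Record perspective: merge records sharing a key (assumed: set union)."""
    groups = {}
    for record, key in zip(records, keys):
        groups.setdefault(key, set()).update(record)
    return list(groups.values())


def support(itemset, records):
    """Fraction of records that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= record for record in records) / len(records)


# Pattern on the raw items.
raw_support = support({"cola", "chips"}, RECORDS)

# Same data seen through the concept hierarchy.
concept_records = [generalize(r, HIERARCHY) for r in RECORDS]
concept_support = support({"soda", "snack"}, concept_records)

# Same data with records organized per customer.
grouped_support = support({"cola", "chips"}, group_records(RECORDS, CUSTOMERS))

print(raw_support, concept_support, grouped_support)
```

In this toy setting the same data yields different supports depending on the chosen view, which is the kind of representation dependence the abstract refers to.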
