Comparison of Discretization Approaches for Granular Association Rule Mining

Granular association rule mining is a new relational data mining approach for revealing patterns hidden across multiple tables. Granular association rules have recently been proposed for cold-start recommendation, where a customer or a product has just entered the system. Existing research considers only nominal data. In this paper, we study the impact of discretization approaches on mining semantically richer and stronger rules from numerical data. Specifically, the equal width, equal frequency, and k-means approaches are adopted and compared. Because the number of intervals is a key setting in any discretization approach, different settings are compared through experiments on a well-known real-life data set. Experimental results show that: 1) discretization is an effective preprocessing technique for mining stronger rules; 2) an appropriate setting of the number of intervals is critical to obtaining more rules; 3) the equal frequency approach outperforms the equal width and k-means approaches; and 4) both the recommendation accuracy and the number of recommendations improve significantly with discretization.
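To make the three compared approaches concrete, the following is a minimal sketch of each on a single numerical attribute. It is not the authors' implementation: the function names, the sample `ages` values, the quantile-based initialization, and the use of batch (Lloyd-style) iterations for k-means are all assumptions made for illustration. Each function maps every value to an interval index in `0..k-1`, where `k` is the number of intervals.

```python
import numpy as np

def equal_width(values, k):
    """Cut [min, max] into k intervals of equal length."""
    values = np.asarray(values, dtype=float)
    edges = np.linspace(values.min(), values.max(), k + 1)
    # Use interior edges only, so the maximum lands in the last interval.
    return np.digitize(values, edges[1:-1])

def equal_frequency(values, k):
    """Cut the sorted values into k groups of (nearly) equal size."""
    values = np.asarray(values, dtype=float)
    # Double argsort yields the rank of each value (ties broken by position).
    ranks = np.argsort(np.argsort(values, kind="stable"), kind="stable")
    return ranks * k // len(values)

def kmeans_1d(values, k, iters=100):
    """One-dimensional k-means with batch updates (illustrative variant)."""
    values = np.asarray(values, dtype=float)
    # Assumption: deterministic start at evenly spaced quantiles.
    centers = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        updated = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    # Final assignment, relabeled so interval 0 has the smallest center.
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    rank_of_center = np.empty(k, dtype=int)
    rank_of_center[np.argsort(centers)] = np.arange(k)
    return rank_of_center[labels]

# Hypothetical sample attribute (not taken from the paper's data set).
ages = [18, 19, 20, 21, 22, 23, 40, 41, 60]
for discretize in (equal_width, equal_frequency, kmeans_1d):
    print(discretize.__name__, discretize(ages, 3).tolist())
```

On skewed data such as the sample above, equal width tends to place most values in one interval, while equal frequency balances interval sizes by construction. How this difference affects the number and strength of the mined rules is the subject of the paper's experiments and is not demonstrated by this sketch.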
