Three-way recommender systems based on random forests

We propose a framework integrating three-way decision and random forests. We introduce a new recommender action that consults the user for the choice. We build a random forest to predict the probability that a user likes an item. The three-way thresholds are optimal for both the training set and the testing set.

Recommender systems attempt to guide users in decisions about choosing items, based on inferences about their personal opinions. Most existing systems implicitly assume that the underlying classification is binary: a candidate item is either recommended or not. Here we propose an alternative framework that integrates three-way decision and random forests to build recommender systems. First, we consider both misclassification cost and teacher cost. The former is incurred by incorrect recommender behavior, while the latter is paid to actively consult the user for his or her preferences. With these costs, a three-way decision model is built, and rational settings for the positive and negative threshold values α* and β* are computed. We next construct a random forest to compute the probability P that a user will like an item. Finally, α*, β*, and P are used to determine the recommender's behavior. The performance of the recommender is evaluated on the basis of average cost. Experimental results on the well-known MovieLens data set show that the (α*, β*)-pair determined by three-way decision is optimal not only on the training set, but also on the testing set.
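The decision rule described above can be illustrated with a minimal sketch. The abstract does not give the cost matrix or the threshold formulas, so the following assumes a common decision-theoretic setup: correct behaviors cost nothing, consulting the user always costs the teacher cost, and each action is chosen to minimize expected cost. Under those assumptions, recommending is cheapest when P ≥ α = (cost_fp − teacher_cost) / cost_fp, and not recommending is cheapest when P ≤ β = teacher_cost / cost_fn. All function and parameter names are hypothetical, and P is taken as given rather than produced by an actual random forest.

```python
from typing import Sequence, Tuple


def three_way_thresholds(cost_fp: float, cost_fn: float,
                         teacher_cost: float) -> Tuple[float, float]:
    """Return (alpha, beta) under the assumed cost model.

    cost_fp: cost of recommending an item the user dislikes (assumed name).
    cost_fn: cost of withholding an item the user likes (assumed name).
    teacher_cost: cost of consulting the user instead of deciding.
    """
    # Recommend when (1 - p) * cost_fp <= teacher_cost.
    alpha = (cost_fp - teacher_cost) / cost_fp
    # Do not recommend when p * cost_fn <= teacher_cost.
    beta = teacher_cost / cost_fn
    if not beta < alpha:
        # Consulting is never the cheapest action; the model is two-way.
        raise ValueError("teacher cost too high for a boundary region")
    return alpha, beta


def decide(p: float, alpha: float, beta: float) -> str:
    """Map a predicted like-probability to one of three behaviors."""
    if p >= alpha:
        return "recommend"
    if p <= beta:
        return "not_recommend"
    return "consult"


def average_cost(probs: Sequence[float], likes: Sequence[bool],
                 cost_fp: float, cost_fn: float,
                 teacher_cost: float) -> float:
    """Average cost of the three-way recommender over labeled pairs."""
    alpha, beta = three_way_thresholds(cost_fp, cost_fn, teacher_cost)
    total = 0.0
    for p, like in zip(probs, likes):
        action = decide(p, alpha, beta)
        if action == "consult":
            total += teacher_cost
        elif action == "recommend" and not like:
            total += cost_fp
        elif action == "not_recommend" and like:
            total += cost_fn
        # Correct recommend / not_recommend behaviors cost nothing here.
    return total / len(probs)
```

In practice, `probs` would come from the class-probability output of a trained random forest, and `average_cost` would be computed separately on the training and testing sets to compare threshold pairs.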
