Recommendation with Generalized Logistic Transformation

Many recommender systems explicitly or implicitly assume that rating data are normally distributed. This assumption is convenient, but it often fails to hold in practice, which degrades system performance. In this paper, we design a recommendation algorithm that embeds a new distribution model. First, we introduce a generalized logistic transformation and a parameter estimator, the Minimum Absolute Skewness Estimator (MASE), to obtain data that follow a generalized Gaussian distribution. Second, we propose a new model, the generalized logit-generalized-normal (GLG-normal) distribution, to fit the observed frequency distribution. Finally, we design the GLG-N probabilistic matrix factorization (GPMF) recommendation algorithm. Experiments were conducted on three subsets of the Jester dataset. The results show that 1) GLG-normal captures the essence of the frequency distribution, and 2) GPMF is 5% better than PMF in terms of MAE and significantly outperforms some other schemes.
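The first step can be illustrated with a minimal sketch. The abstract does not give the exact form of the transformation or of MASE, so everything here is an assumption: the transform is taken to be the generalized logit ln(x^ν / (1 − x^ν)) on ratings rescaled to (0, 1), and the estimator is taken to be a plain grid search for the shape parameter ν that minimizes the absolute sample skewness of the transformed data. The function names (`generalized_logit`, `skewness`, `min_abs_skew_nu`, `rescale`), the clipping margin `eps`, and the synthetic ratings are all hypothetical.

```python
import math
import random


def rescale(r, lo=-10.0, hi=10.0, eps=1e-3):
    """Map a rating in [lo, hi] into the open interval (0, 1).

    The margin eps (an assumption) keeps the logit finite at the endpoints.
    """
    return eps + (1.0 - 2.0 * eps) * (r - lo) / (hi - lo)


def generalized_logit(x, nu):
    """Assumed transform: ln(x**nu / (1 - x**nu)) for x in (0, 1), nu > 0."""
    p = x ** nu
    return math.log(p / (1.0 - p))


def skewness(values):
    """Sample skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def min_abs_skew_nu(data, grid):
    """Return the nu in grid whose transformed data has the smallest |skewness|."""
    return min(
        grid,
        key=lambda nu: abs(skewness([generalized_logit(x, nu) for x in data])),
    )


# Synthetic, left-skewed ratings on the Jester scale [-10, 10] (illustrative only).
random.seed(0)
ratings = [20.0 * random.betavariate(5, 2) - 10.0 for _ in range(2000)]
data = [rescale(r) for r in ratings]

grid = [k / 10 for k in range(1, 51)]  # candidate nu values 0.1 .. 5.0
best_nu = min_abs_skew_nu(data, grid)
print("selected nu:", best_nu)
```

A grid search is used only because it is easy to verify; any one-dimensional optimizer could take its place. The sketch covers the transformation step alone and does not attempt the GLG-normal fit or the GPMF factorization.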
