Deep Convolutional Neural Network and Multi-View Stacking Ensemble in Ali Mobile Recommendation Algorithm Competition: The Solution to the Winning of Ali Mobile Recommendation Algorithm Competition

We proposed a deep Convolutional Neural Network (CNN) approach and a Multi-View Stacking Ensemble (MVSE) method in Season 1 and Season 2 of the Ali Mobile Recommendation Algorithm competition, respectively. Specifically, we treated the recommendation task as a classical binary classification problem. We designed a large number of indicative features based on the logic of the mobile business and grouped them into ten clusters according to their properties. In Season 1, a two-dimensional (2D) feature map covering both a time axis and a feature-cluster axis was created from the original features. This design allowed the CNN to make predictions based on both the short-term actions and the long-term behavioral habits of mobile users. Combined with some traditional ensemble methods, the CNN ranked No. 2 in Season 1. In Season 2, we proposed the Multi-View Stacking Ensemble (MVSE) method, which uses the stacking technique to efficiently combine different views of features. A classifier was first trained on each of the ten feature clusters. The predictions of the ten classifiers were then used as additional features. Based on the augmented features, an ensemble classifier was trained to generate the final prediction. We continuously updated our model by padding in the new stacking features, and finally achieved an F1 score of 8.78%, ranking No. 1 in Season 2 among over 7,000 teams.
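The MVSE procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, assuming logistic regression as both the per-cluster base learner and the ensemble (meta) classifier; the actual competition pipeline used different models and real feature clusters. Out-of-fold predictions are used as the stacking features so the meta classifier does not see overfit base-model outputs.

```python
# Minimal sketch of Multi-View Stacking Ensemble (MVSE): one base classifier
# per feature cluster, whose out-of-fold predictions become extra features
# for an ensemble classifier trained on the augmented representation.
# Base/meta model choices and the synthetic data are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_samples, n_clusters, cluster_dim = 500, 10, 5

# Synthetic stand-in for the ten feature clusters ("views").
views = [rng.normal(size=(n_samples, cluster_dim)) for _ in range(n_clusters)]
y = (sum(v[:, 0] for v in views) > 0).astype(int)  # synthetic binary label

# Step 1: train one classifier per feature cluster; collect its out-of-fold
# predicted probabilities as the stacking features.
meta_features = np.column_stack([
    cross_val_predict(LogisticRegression(max_iter=1000), v, y, cv=5,
                      method="predict_proba")[:, 1]
    for v in views
])

# Step 2: pad the original features with the stacking features and train the
# ensemble classifier on the augmented feature set.
X_aug = np.hstack(views + [meta_features])
meta_clf = LogisticRegression(max_iter=1000).fit(X_aug, y)
```

The same padding step can be repeated as new stacking features are produced, which matches the continuous model-updating strategy mentioned above.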
