Dynamic Attention Deep Model for Article Recommendation by Learning Human Editors' Demonstration

As aggregators, online news portals face great challenges in continuously selecting a pool of candidate articles to be shown to their users. Typically, those candidate articles are recommended manually by platform editors from a much larger pool of articles aggregated from multiple sources. Such a hand-picking process is labor-intensive and time-consuming. In this paper, we study editors' article selection behavior and propose a learning-by-demonstration system to automatically select a subset of articles from the large pool. Our data analysis shows that (i) editors' selection criteria are non-explicit: they depend less on keywords or topics alone and more on the quality and attractiveness of the writing in the candidate article, which is hard to capture with a traditional bag-of-words article representation; and (ii) editors' article selection behaviors are dynamic: articles with different data distributions come into the pool every day, and the editors' preferences vary, driven by underlying periodic or occasional patterns. To address these problems, we propose a meta-attention model across multiple deep neural nets to (i) automatically capture the editors' underlying selection criteria via automatic representation learning of each article and its interaction with the metadata, and (ii) adaptively capture changes in such criteria via a hybrid attention model. The attention model strategically incorporates multiple prediction models that were trained in previous days. The system has been deployed in a commercial article feed platform. A 9-day A/B test has demonstrated the consistent superiority of our proposed model over several strong baselines.
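
The idea of attending over prediction models trained in previous days can be illustrated with a minimal sketch. This is not the paper's architecture: the dot-product scoring, the function names (`softmax`, `attention_combine`), and the `context` and `keys` vectors are all assumptions made for illustration. In a real system the attention weights would come from a learned network rather than from fixed vectors.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attention_combine(day_scores, context, keys):
    """Blend the scores of several per-day models for one article.

    day_scores: selection probability from each previously trained model.
    context:    hypothetical feature vector describing the current day/article.
    keys:       one hypothetical key vector per model, same length as context.

    Returns (combined_score, attention_weights).
    """
    # Assumed scoring rule: dot product between the context and each model key.
    logits = [sum(c * k for c, k in zip(context, key)) for key in keys]
    weights = softmax(logits)
    combined = sum(w * s for w, s in zip(weights, day_scores))
    return combined, weights


if __name__ == "__main__":
    # Three models trained on three earlier days score the same article.
    scores = [0.2, 0.6, 0.7]
    context = [1.0, 0.5]
    keys = [[0.1, 0.0], [0.4, 0.2], [1.0, 1.0]]
    combined, weights = attention_combine(scores, context, keys)
    print("attention weights:", weights)
    print("combined score:", combined)
```

Because the weights are a softmax, they are non-negative and sum to one, so the combined score always lies between the smallest and largest per-day score. When every key yields the same logit, the blend reduces to a plain average of the per-day models.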
