Learning to Recommend via Meta Parameter Partition

In this paper we address an important problem in recommendation -- user cold start -- with a meta learning method. Previous meta learning approaches finetune all model parameters for each new user, which is expensive in both computation and storage. In contrast, we divide the model parameters into a fixed part and an adaptive part and develop a two-stage meta learning algorithm to learn them separately. The fixed part, capturing user-invariant features, is shared by all users and is learned during an offline meta learning stage. The adaptive part, capturing user-specific features, is learned during an online meta learning stage. By decoupling user-invariant parameters from user-dependent parameters, the proposed approach is more computationally efficient and cheaper in storage than previous methods. It also has the potential to mitigate catastrophic forgetting while continually adapting to a stream of incoming users. Experiments on production data demonstrate that the proposed method converges faster and to better performance than baseline methods. Meta-training without online meta-model finetuning increases the AUC from 72.24% to 74.72% (a 2.48% absolute improvement), and online meta training yields a further 2.46% absolute improvement over offline meta training.
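
To make the parameter partition concrete, below is a minimal PyTorch sketch under our own assumptions: a shared trunk plays the role of the fixed (user-invariant) part, a small linear head plays the adaptive (user-specific) part, and the offline stage uses a first-order (FOMAML-style) outer update on the shared part only. All names (RecModel, inner_adapt, offline_meta_train, online_adapt), the split point, and the hyperparameters are illustrative, not taken from the paper.

```python
# Illustrative sketch of a fixed/adaptive parameter partition for
# two-stage meta learning. The split point, optimizer choices, and the
# first-order outer update are assumptions, not the paper's exact method.
import copy
import torch
import torch.nn as nn

class RecModel(nn.Module):
    def __init__(self, n_features=32, hidden=64):
        super().__init__()
        # Fixed part: user-invariant representation, shared by all users.
        self.shared = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        # Adaptive part: small user-specific head, finetuned per user.
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        return self.head(self.shared(x)).squeeze(-1)

def inner_adapt(model, support_x, support_y, steps=3, lr=1e-2):
    """Adapt ONLY the user-specific head on one user's support data."""
    opt = torch.optim.SGD(model.head.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(support_x), support_y).backward()
        opt.step()

def offline_meta_train(model, user_tasks, meta_lr=1e-3, epochs=10):
    """Stage 1 (offline): meta-learn the shared part across users with a
    first-order outer update evaluated after per-user head adaptation."""
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for sx, sy, qx, qy in user_tasks:  # (support, query) per user
            fast = copy.deepcopy(model)
            inner_adapt(fast, sx, sy)
            query_loss = loss_fn(fast(qx), qy)
            grads = torch.autograd.grad(query_loss,
                                        list(fast.shared.parameters()))
            with torch.no_grad():  # update only the meta model's shared part
                for p, g in zip(model.shared.parameters(), grads):
                    p -= meta_lr * g

def online_adapt(model, user_x, user_y):
    """Stage 2 (online): for each new user, freeze the shared part and
    finetune only the small adaptive head; store just the head per user."""
    user_model = copy.deepcopy(model)
    for p in user_model.shared.parameters():
        p.requires_grad_(False)
    inner_adapt(user_model, user_x, user_y)
    return user_model.head.state_dict()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = RecModel()
    tasks = [(torch.randn(8, 32), torch.randint(0, 2, (8,)).float(),
              torch.randn(8, 32), torch.randint(0, 2, (8,)).float())
             for _ in range(4)]
    offline_meta_train(model, tasks, epochs=2)
    head = online_adapt(model, *tasks[0][:2])
    print({k: v.shape for k, v in head.items()})
```

Note that in this sketch the per-user state shrinks from a full model checkpoint to the head's state dict, which illustrates the storage saving the abstract claims; freezing the shared part during online adaptation is also one simple way the approach could limit catastrophic forgetting.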
