Affinity Regularized Non-Negative Matrix Factorization for Lifelong Topic Modeling

Lifelong topic modeling (LTM), an emerging paradigm for never-ending topic learning, aims to yield higher-quality topics over time by accumulating knowledge from past tasks and applying it to future ones. In this paper, we propose a novel lifelong topic model based on non-negative matrix factorization (NMF), called Affinity Regularized NMF for LTM (NMF-LTM), which, to the best of our knowledge, is distinct from the popular LDA-based LTMs. NMF-LTM achieves lifelong learning by introducing a word-word graph Laplacian as a semantic affinity regularizer. Other priors, such as sparsity, diversity, and between-class affinity, are also incorporated for better performance, and we provide a theoretical guarantee that the algorithm converges to a local minimum. Extensive experiments on various public corpora demonstrate the effectiveness of NMF-LTM, particularly its human-like behavior in two carefully designed learning tasks and its ability to model topics in big data. A further exploration of semantic relatedness in knowledge graphs and a case study on a large-scale real-world corpus show the strength of NMF-LTM in discovering high-quality topics efficiently and robustly.
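To make the idea of a word-word graph Laplacian regularizer concrete, the following is a minimal sketch of generic graph-regularized NMF with multiplicative updates. It is not the NMF-LTM algorithm itself: the objective ||X - WH||_F^2 + lam * tr(W^T L W), the update rules, the function name `graph_regularized_nmf`, and all parameter names are assumptions chosen for illustration, and the sparsity, diversity, and between-class affinity priors mentioned above are omitted.

```python
import numpy as np

def graph_regularized_nmf(X, A, k, lam=0.1, n_iter=200, eps=1e-10, seed=0):
    """Illustrative graph-regularized NMF (hypothetical helper, not NMF-LTM).

    X : (n_words, n_docs) non-negative word-document matrix
    A : (n_words, n_words) symmetric non-negative word-word affinity matrix
    k : number of topics
    Minimizes ||X - W H||_F^2 + lam * tr(W^T L W) with L = D - A.
    """
    rng = np.random.default_rng(seed)
    n_words, n_docs = X.shape
    W = rng.random((n_words, k))   # word-topic factors
    H = rng.random((k, n_docs))    # topic-document factors
    D = np.diag(A.sum(axis=1))     # degree matrix of the affinity graph
    L = D - A                      # graph Laplacian
    losses = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        W *= (X @ H.T + lam * (A @ W)) / (W @ (H @ H.T) + lam * (D @ W) + eps)
        H *= (W.T @ X) / ((W.T @ W) @ H + eps)
        # Objective value after this iteration.
        loss = np.linalg.norm(X - W @ H) ** 2 + lam * np.trace(W.T @ L @ W)
        losses.append(loss)
    return W, H, losses
```

In a lifelong setting, one would presumably build the affinity matrix `A` from topics learned on earlier corpora, so that words that co-occurred in past topics are pulled toward similar topic loadings; how NMF-LTM constructs and updates this matrix is not specified in the abstract.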
