Hybrid neural networks for social emotion detection over short text

Short text is prevalent on the Web, but its lack of contextual information poses challenges for content analysis methods. The biterm topic model (BTM), a variant of latent Dirichlet allocation, effectively infers the latent topic distribution of short text by modeling the generation of biterms (unordered co-occurring word pairs) over the whole corpus. When applied to supervised learning, however, the topics it learns need fine-tuning with labels to reduce noise. Motivated by transfer learning, we propose hybrid neural networks that combine BTM with conventional neural networks: the hidden layer of the network is first trained to approximate the inference of BTM. Following this pre-training phase, we use the simple back-propagation algorithm to fine-tune the topic distribution learned from BTM and thereby improve the performance of supervised learning. Experiments on two diverse collections of short text validate the effectiveness of the proposed hybrid neural networks for social emotion detection.
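The two-phase training described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the softmax hidden layer, the cross-entropy losses, the learning rate, and the names pretrain_hidden, fine_tune, predict and toy_data are all assumptions made for illustration, and the topic distributions theta are synthetic stand-ins for what a real BTM inference step would produce.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(target, pred):
    """Mean cross-entropy between target rows and predicted rows."""
    return float(-(target * np.log(pred + 1e-12)).sum(axis=1).mean())

def pretrain_hidden(X, theta, epochs=200, lr=0.5, seed=0):
    """Phase 1 (assumed form): fit W1 so that softmax(X @ W1) approximates theta."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.01, size=(X.shape[1], theta.shape[1]))
    for _ in range(epochs):
        H = softmax(X @ W1)
        # Gradient of softmax cross-entropy w.r.t. the logits is (H - theta).
        W1 -= lr * X.T @ (H - theta) / len(X)
    return W1

def fine_tune(X, Y, W1, epochs=200, lr=0.5, seed=0):
    """Phase 2: add an output layer and back-propagate the emotion loss through both layers."""
    rng = np.random.default_rng(seed)
    W1 = W1.copy()
    W2 = rng.normal(scale=0.01, size=(W1.shape[1], Y.shape[1]))
    for _ in range(epochs):
        H = softmax(X @ W1)                    # hidden "topic" layer
        P = softmax(H @ W2)                    # predicted emotion distribution
        dZ2 = (P - Y) / len(X)                 # gradient w.r.t. output logits
        dH = dZ2 @ W2.T                        # gradient w.r.t. hidden activations
        # Back-propagate through the hidden softmax (Jacobian-vector product).
        dZ1 = H * (dH - (dH * H).sum(axis=1, keepdims=True))
        W2 -= lr * H.T @ dZ2
        W1 -= lr * X.T @ dZ1
    return W1, W2

def predict(X, W1, W2):
    """Forward pass: documents -> topic layer -> emotion distribution."""
    return softmax(softmax(X @ W1) @ W2)

def toy_data(n=200, vocab=30, topics=5, emotions=4, seed=1):
    """Synthetic stand-ins: X = term frequencies, theta = 'BTM' topics, Y = emotion ratings."""
    rng = np.random.default_rng(seed)
    X = rng.dirichlet(0.5 * np.ones(vocab), size=n)
    theta = softmax(X @ rng.normal(scale=3.0, size=(vocab, topics)))
    Y = softmax(theta @ rng.normal(scale=3.0, size=(topics, emotions)))
    return X, theta, Y

if __name__ == "__main__":
    X, theta, Y = toy_data()
    W1 = pretrain_hidden(X, theta)             # phase 1: imitate topic inference
    W1, W2 = fine_tune(X, Y, W1)               # phase 2: supervised fine-tuning
    print("emotion cross-entropy:", cross_entropy(Y, predict(X, W1, W2)))
```

In a real pipeline, theta would come from running BTM on the corpus and Y from reader emotion ratings; the sketch only shows how the hidden layer can start from a topic-like representation before the labels adjust it.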
