Cross-Domain Sentiment Encoding through Stochastic Word Embedding

Sentiment analysis concerns the identification of feelings, attitudes, emotions and opinions from text. To automate such analysis, a large amount of example text needs to be manually annotated for model training. This is laborious and expensive, and cross-domain techniques are a key way to reduce the cost by reusing annotated reviews across domains. However, their success relies largely on learning a robust common representation space across domains. In recent years, significant effort has been invested in improving cross-domain representation learning by designing increasingly complex and elaborate model inputs and architectures. We argue that it is not necessary to increase design complexity, as this inevitably lengthens model training. Instead, we propose to exploit word polarity and occurrence information through a simple mapping, encoding such information more accurately while keeping the computational cost low. The proposed approach differs from existing work in using the stochastic embedding technique to tackle cross-domain sentiment alignment. Its effectiveness is benchmarked on more than ten tasks constructed from two review corpora, and it is compared against ten classical and state-of-the-art methods.
