Deep sentiment hashing for text retrieval in social CIoT

Abstract: Sentiment-based text retrieval is an urgent and valuable task due to the explosive growth of sentiment-bearing reviews on social networks such as Twitter, Facebook, and Instagram. Social networks within the domain of the Cognitive Internet of Things (CIoT) make it much easier to dynamically discover desirable services and valuable information. Information retrieval in social media is nevertheless a daunting task that requires considerable technical insight. As a powerful tool for large-scale information retrieval, hashing techniques have been extensively employed for text retrieval. However, most existing text hashing methods are impractical for sentiment-bearing text retrieval, mainly for three reasons: (1) the text representations are captured by shallow machine learning algorithms; (2) sentiment is rarely considered when measuring the similarity of two documents; and (3) hash functions are learned without supervision because hash labels are lacking. To address these problems, we put forward a general deep sentiment hashing model composed of three steps. First, a hierarchical attention-based Long Short-Term Memory (LSTM) network is trained to obtain sentiment-specific document representations. Second, given the document embeddings, the k-Nearest Neighbor (kNN) algorithm is used to construct a Laplacian matrix, which is then projected into hash labels via Laplacian Eigenmaps (LapEig). Third, we build a deep model for hash function learning, supervised by both the generated hash labels and the original sentiment labels. This joint supervision ensures that the final hash codes produced by the learned hash functions preserve sentiment-level similarity. Experimental results show that the proposed approach achieves effective retrieval performance.
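
The second step (kNN graph, Laplacian matrix, Laplacian Eigenmaps, hash labels) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function name `knn_laplacian_hash_labels`, the parameters `k` and `n_bits`, the binary edge weights, the unnormalized Laplacian (the usual Laplacian Eigenmaps formulation solves a generalized eigenproblem instead), and the median thresholding are all assumptions made for illustration. Random vectors stand in for the LSTM document embeddings.

```python
import numpy as np

def knn_laplacian_hash_labels(embeddings, k=5, n_bits=8):
    """Hypothetical sketch: kNN graph -> graph Laplacian -> eigenvectors -> bits."""
    n = embeddings.shape[0]

    # Pairwise squared Euclidean distances between document embeddings.
    sq = np.sum(embeddings ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * embeddings @ embeddings.T
    np.fill_diagonal(dist, np.inf)  # a document is not its own neighbour

    # Binary adjacency: connect each document to its k nearest neighbours.
    nbrs = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrise the graph

    # Unnormalized graph Laplacian L = D - W (a simplification, see lead-in).
    L = np.diag(W.sum(axis=1)) - W

    # eigh returns eigenvalues in ascending order; skip the first eigenvector,
    # which is constant when the graph is connected.
    _, eigvecs = np.linalg.eigh(L)
    Y = eigvecs[:, 1:n_bits + 1]

    # Assumed binarization: threshold each dimension at its median.
    return (Y > np.median(Y, axis=0)).astype(np.uint8)

rng = np.random.default_rng(0)
doc_embeddings = rng.normal(size=(40, 16))  # placeholder for LSTM outputs
hash_labels = knn_laplacian_hash_labels(doc_embeddings, k=5, n_bits=4)
```

In the pipeline described above, codes like `hash_labels` would then serve as targets, alongside the sentiment labels, when training the hash function network in the third step.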
