Emotion Recognition for Japanese Short Sentences Including Slangs Based on Bag of Concepts Feature Trained by Large Web Text

The growth of Internet communication sites such as weblogs and social networking sites has led younger people, especially those in their teens and 20s, to create new words and use them frequently. We prepared an emotion corpus by collecting weblog articles containing new words, analyzed the corpus statistically, and proposed a method to estimate the emotions expressed in the texts. Most slang words, such as youth slang, are too ambiguous in sense to be registered in existing dictionaries such as thesauri. To cope with these words, we created a large-scale Twitter corpus and calculated sense similarities between words. We proposed converting unknown words to semantic class IDs so that words not included in the training data can still be processed. To calculate similarities between words and to convert each word into a word cluster ID, we used word embedding algorithms such as word2vec or GloVe. We define this approach as a method that uses Bag of Concepts as a feature. In an evaluation experiment using several classifiers, the proposed method proved robust to unknown expressions.
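The conversion from words to cluster IDs described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors, the two centroids, and the function names are hypothetical, and the sketch assumes that word embeddings (for example from word2vec or GloVe) and cluster centroids have already been computed from a large corpus. Each token is assigned to the concept whose centroid is most similar by cosine similarity, and a sentence is represented by the counts of its concept IDs.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_concept(vector, centroids):
    """Return the index of the centroid most similar to the vector."""
    return max(range(len(centroids)), key=lambda i: cosine(vector, centroids[i]))


def bag_of_concepts(tokens, embeddings, centroids):
    """Count concept IDs for the tokens that have an embedding."""
    counts = Counter(
        nearest_concept(embeddings[t], centroids)
        for t in tokens
        if t in embeddings
    )
    return [counts.get(i, 0) for i in range(len(centroids))]


# Toy, hand-made embeddings (assumption: real vectors would come from a
# model trained on a large web corpus and have far more dimensions).
EMBEDDINGS = {
    "happy": [0.9, 0.1],
    "glad": [0.8, 0.2],
    "yabai": [0.7, 0.3],   # slang word, used here only as an illustration
    "sad": [0.1, 0.9],
    "gloomy": [0.2, 0.8],
}

# Hypothetical centroids, e.g. as produced by a clustering step.
CENTROIDS = [[1.0, 0.0], [0.0, 1.0]]

features = bag_of_concepts(["yabai", "glad", "sad"], EMBEDDINGS, CENTROIDS)
print(features)  # [2, 1]
```

In this toy setting the slang token falls into the same concept as "glad", so a classifier trained on concept counts can handle it even if the token never appeared in the labeled training data, provided it has an embedding.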
