Fuzzy Clustering for Semi-supervised Learning - Case Study: Construction of an Emotion Lexicon

We consider the task of semi-supervised classification: extending category labels from a small set of labeled examples to a much larger set of unlabeled ones. We show that, at least on our case study task, unsupervised fuzzy clustering of the unlabeled examples improves the subsequent hard clustering. Specifically, we use the membership values obtained from fuzzy clustering in two ways: as additional features for hard clustering, and to reduce the confusion set, that is, the set of candidate labels the hard clustering must choose among for each example. As a case study, we applied the proposed method to the task of constructing a large emotion lexicon by extending the emotion labels of the WordNet Affect lexicon to new words using various word features. Some of the features were extracted from the emotional statements of the freely available ISEAR dataset; the others were WordNet distance and similarity measured via the polarity scores in the SenticNet resource. The proposed method assigned emotion labels to words with high accuracy.
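
The two uses of the membership values described above can be illustrated with a minimal sketch. It assumes a plain fuzzy c-means with the standard alternating center and membership updates, run on synthetic two-dimensional data rather than on the word features of the case study. The function names (`fuzzy_c_means`, `augment_with_memberships`, `confusion_set`), the fuzzifier `m = 2`, and the choice of `k` are illustrative assumptions, not the settings used in this work, and the hard clustering step itself is not shown.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (assumed variant, fuzzifier m > 1).

    Returns (centers, U), where U[i, j] is the membership of sample i
    in cluster j and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def augment_with_memberships(X, U):
    """Append the fuzzy memberships to the original feature vectors."""
    return np.hstack([X, U])


def confusion_set(U, k=2):
    """For each sample, the indices of its k highest-membership clusters."""
    return np.argsort(-U, axis=1)[:, :k]


# Synthetic stand-in data: two well-separated blobs of 20 points each.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])

centers, U = fuzzy_c_means(X, n_clusters=2)
X_aug = augment_with_memberships(X, U)  # input features for the hard step
candidates = confusion_set(U, k=1)      # labels the hard step may consider
```

In this sketch a downstream hard classifier would be trained on `X_aug` and, for each example, restricted to the clusters listed in the corresponding row of `candidates`. With more than two categories, a `k` between 1 and the number of clusters trades recall of the correct label against the size of the confusion set.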
