Opinion retrieval through unsupervised topological learning

Opinion mining is the computational study of people's emotional behavior as expressed in text. The purpose of this article is to introduce a new framework for emotion (opinion) mining based on topological unsupervised learning and hierarchical clustering. In contrast to supervised learning, characterizing clusters in unsupervised opinion mining is challenging, because label information is either unavailable or not used to guide the learning algorithm. The algorithm described in this paper provides a topological clustering of the opinions expressed in tweets, with each cluster associated with a prototype and a weight vector that reflects the relevance of the data belonging to that cluster. The proposed framework requires only simple computational techniques and is based on the double local weighting self-organizing map (dlw-SOM) model and hierarchical clustering. The framework has been applied to a real dataset of tweets collected during the 2012 French election campaign.
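
The two-stage idea described above (learn a topological map of prototypes, then group the prototypes hierarchically) can be illustrated with a minimal sketch. It rests on several assumptions: it uses a plain online self-organizing map rather than the dlw-SOM model, so no local weight vectors are learned; a handful of toy 2-D points stand in for tweet feature vectors; and the function names `train_som`, `best_unit`, and `agglomerate` are hypothetical, chosen only for this illustration.

```python
import math
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def best_unit(x, protos):
    """Index of the prototype closest to x (the best matching unit)."""
    return min(range(len(protos)), key=lambda k: sq_dist(x, protos[k]))


def train_som(data, rows, cols, epochs=30, lr0=0.5, sigma0=1.0, seed=0):
    """Plain online SOM (assumption: no local weighting as in dlw-SOM)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each map unit with a randomly chosen data point.
    protos = [list(rng.choice(data)) for _ in range(rows * cols)]
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.1     # shrinking neighbourhood
            br, bc = coords[best_unit(x, protos)]
            for k, (r, c) in enumerate(coords):
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for j in range(dim):
                    protos[k][j] += lr * h * (x[j] - protos[k][j])
            step += 1
    return protos


def agglomerate(protos, n_clusters):
    """Average-linkage agglomerative clustering of the SOM prototypes."""
    clusters = [[i] for i in range(len(protos))]

    def link(a, b):
        total = sum(math.sqrt(sq_dist(protos[i], protos[j])) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average distance.
        _, a, b = min(
            (link(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        merged = clusters.pop(b)
        clusters[a].extend(merged)
    labels = [0] * len(protos)
    for lab, members in enumerate(clusters):
        for i in members:
            labels[i] = lab
    return labels


# Toy stand-in for document vectors: two well-separated groups of points.
blob_a = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
blob_b = [(10.0, 10.0), (10.2, 9.9), (9.9, 10.3), (10.1, 10.1)]
data = blob_a + blob_b

protos = train_som(data, rows=2, cols=2)
proto_labels = agglomerate(protos, n_clusters=2)
# Each point inherits the cluster of its best matching unit.
point_labels = [proto_labels[best_unit(x, protos)] for x in data]
print(point_labels)
```

Clustering the prototypes instead of the raw data keeps the hierarchical step cheap, since its cost depends on the map size rather than on the number of tweets. A feature-weighted variant would additionally learn a weight vector per map unit, which this sketch omits.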
