Opinion mining for Thai restaurant reviews using K-Means clustering and MRF feature selection

Mining opinions from millions of Thai restaurant reviews in an unsupervised manner is a challenging way to survey customer feedback on a restaurant's products and services. Such feedback is extremely helpful for owners seeking to improve their business. In this paper, we propose an opinion mining method for Thai restaurant reviews using K-Means clustering and MRF feature selection. The proposed method begins with text preprocessing, which breaks reviews into words and removes stop words, followed by text transformation, which creates keywords and generates input vectors. MRF feature selection is then applied to select relevant features from the large number of extracted features. Finally, K-Means is employed to cluster the reviews into positive and negative groups. The experimental results show that MRF feature selection efficiently reduces the number of features in the data set, so that computational time decreases significantly. In addition, K-Means achieves the best clustering performance when compared with Self-Organizing Map, Fuzzy C-Means, and Hierarchical Clustering. Thus, the combination of K-Means with MRF feature selection is an effective model for clustering Thai restaurant reviews.
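The four stages described in the abstract (preprocessing, transformation, feature selection, clustering) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the reviews are English placeholders split on whitespace (real Thai text has no spaces between words and would need a word segmenter), the stop-word list is hypothetical, a simple variance ranking stands in for MRF feature selection, and K-Means uses a deterministic farthest-point initialization.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "was", "and"}  # hypothetical stop-word list

def tokenize(review):
    """Preprocessing: split a review into words and drop stop words.
    Whitespace splitting is a placeholder for Thai word segmentation."""
    return [w for w in review.lower().split() if w not in STOP_WORDS]

def build_vectors(reviews):
    """Transformation: term-frequency vectors over a shared, sorted vocabulary."""
    docs = [Counter(tokenize(r)) for r in reviews]
    vocab = sorted({w for d in docs for w in d})
    return vocab, [[float(d[w]) for w in vocab] for d in docs]

def select_features(vocab, vectors, keep):
    """Stand-in for MRF feature selection (NOT the MRF method itself):
    keep the `keep` terms whose counts vary most across reviews."""
    n = len(vectors)

    def variance(j):
        col = [v[j] for v in vectors]
        mean = sum(col) / n
        return sum((x - mean) ** 2 for x in col) / n

    top = sorted(range(len(vocab)), key=variance, reverse=True)[:keep]
    top.sort()  # restore vocabulary order
    return [vocab[j] for j in top], [[v[j] for j in top] for v in vectors]

def kmeans(vectors, k=2, iters=50):
    """Lloyd's K-Means with Euclidean distance; returns one label per vector."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

reviews = [
    "the food was delicious and service was great",
    "delicious food and great price",
    "the food was bland and service was slow",
    "slow service and bland soup",
]
vocab, vectors = build_vectors(reviews)
selected, reduced = select_features(vocab, vectors, keep=4)
labels = kmeans(reduced, k=2)
print(selected)  # ['bland', 'delicious', 'great', 'slow']
print(labels)    # [0, 0, 1, 1]
```

Note that K-Means only separates the reviews into two groups; deciding which cluster is "positive" and which is "negative" requires a further step, such as inspecting the dominant terms in each cluster.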
