Mining online reviews in Indonesia's priority tourist destinations using sentiment analysis and text summarization approach

Online hotel reviews play a major role in tourism because hotels are one of the factors that determine the competitiveness of a tourist area, yet the use of such reviews in practice is still rare. In light of the government's plan to increase tourist arrivals to Indonesia, this research applied text mining to online hotel reviews to extract knowledge useful for developing the hospitality sector as an integral part of the tourism industry. A text classification technique was used to obtain the sentiment expressed in review sentences (sentiment analysis), and a clustering technique was used as part of text summarization to find representative sentences that describe the entire content of the reviews. The main contribution of this research is the combination of two text mining techniques, sentiment analysis and text summarization, a combination that has not been attempted before. In experiments with hotel reviews from Labuan Bajo and Bali, the classification model reached an accuracy of 78% and the clustering algorithm achieved a Davies-Bouldin Index (DBI) of 0.071. The output of this research is expected to describe the condition of hotels in tourist areas at different levels of tourism development, so that it can contribute to improving the quality of the hotel industry and to supporting the tourism industry in Indonesia.
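The abstract does not name the clustering algorithm or the text representation, so the following is only a minimal sketch of the general idea under stated assumptions: sentences are turned into simple term-count vectors, the cluster assignments are taken as given (the `labels` list is hypothetical, standing in for the output of a clustering step), the sentence closest to each cluster centroid is chosen as the representative, and the Davies-Bouldin Index is computed from its standard definition (lower values indicate more compact, better separated clusters). The review sentences and all function names are illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Term counts after lowercasing and whitespace tokenization (toy preprocessing)."""
    return Counter(sentence.lower().split())


def to_vector(counts, vocab):
    """Map term counts onto a fixed vocabulary order."""
    return [float(counts.get(term, 0)) for term in vocab]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def representative(sentences, vectors):
    """Return the sentence whose vector lies closest to the cluster centroid."""
    center = centroid(vectors)
    best = min(range(len(vectors)), key=lambda i: euclidean(vectors[i], center))
    return sentences[best]


def davies_bouldin(clusters):
    """Davies-Bouldin Index for a list of clusters (each a list of vectors).

    Assumes at least two clusters with distinct centroids.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance of each cluster's members to its own centroid.
    scatter = [
        sum(euclidean(v, centers[i]) for v in c) / len(c)
        for i, c in enumerate(clusters)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity ratio between cluster i and any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / euclidean(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical review sentences and hypothetical cluster assignments.
reviews = [
    "the room was clean",
    "the room was clean and quiet",
    "clean room",
    "staff were friendly",
    "the staff were friendly and helpful",
    "friendly staff",
]
labels = [0, 0, 0, 1, 1, 1]

vocab = sorted({term for s in reviews for term in s.lower().split()})
vectors = [to_vector(bag_of_words(s), vocab) for s in reviews]

grouped = {}
for sentence, vector, label in zip(reviews, vectors, labels):
    sentences_in_cluster, vectors_in_cluster = grouped.setdefault(label, ([], []))
    sentences_in_cluster.append(sentence)
    vectors_in_cluster.append(vector)

summary = [representative(s, v) for s, v in grouped.values()]
dbi = davies_bouldin([v for _, v in grouped.values()])
print(summary)
print(dbi)
```

In a real pipeline the term counts would normally be replaced by a weighted representation and the labels by the output of an actual clustering algorithm; the sketch only shows how representative sentences and the DBI relate to a given clustering.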
