Frequent Patterns Based Word Network: What Can We Obtain from the Tourism Blogs?

In this work, we present a method for extracting information of interest to a specific reader from massive tourism blog data. First, we use a web crawler to obtain blog contents from the web and divide them into semantic word segments. Second, after the necessary data cleaning, we apply frequent pattern mining to discover the useful frequent 1- and 2-itemsets among the words. Third, we visualize all the word correlations as a word network. Finally, we propose a local information search method based on the max-confidence measure that enables blog readers to specify a topic word of interest and find the relevant content. We illustrate the benefits of this approach by applying it to a Chinese online tourism blog dataset.
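
The mining and search steps can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each blog post has already been crawled, segmented, and cleaned into a list of words, and it takes max-confidence of a word pair (A, B) to be max(P(A|B), P(B|A)), which equals sup(AB) / min(sup(A), sup(B)). The function names, the toy posts, and the thresholds `min_support` and `threshold` are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_itemsets(docs, min_support=2):
    """Count frequent 1- and 2-itemsets; support = number of posts containing the itemset."""
    word_count = Counter()
    pair_count = Counter()
    for words in docs:
        unique = sorted(set(words))  # one vote per post, canonical pair order
        word_count.update(unique)
        pair_count.update(combinations(unique, 2))
    freq_words = {w: c for w, c in word_count.items() if c >= min_support}
    freq_pairs = {p: c for p, c in pair_count.items() if c >= min_support}
    return freq_words, freq_pairs


def max_confidence(pair, freq_words, freq_pairs):
    """max(P(a|b), P(b|a)) = sup(ab) / min(sup(a), sup(b))."""
    a, b = pair
    return freq_pairs[pair] / min(freq_words[a], freq_words[b])


def local_search(topic, freq_words, freq_pairs, threshold=0.5):
    """Return the words linked to `topic` in the word network, strongest first."""
    hits = []
    for a, b in freq_pairs:
        if topic in (a, b):
            score = max_confidence((a, b), freq_words, freq_pairs)
            if score >= threshold:
                hits.append((b if a == topic else a, score))
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Toy, already-segmented posts (hypothetical data, not from the paper's dataset).
docs = [
    ["beijing", "great_wall", "hotel"],
    ["beijing", "great_wall", "duck"],
    ["beijing", "hotel"],
    ["shanghai", "bund", "hotel"],
    ["shanghai", "bund"],
]

freq_words, freq_pairs = mine_itemsets(docs, min_support=2)
result = local_search("beijing", freq_words, freq_pairs, threshold=0.5)
print(result)
```

In this sketch the frequent 2-itemsets are the edges of the word network and the max-confidence scores are the edge weights, so a search for a topic word only inspects the edges that touch it.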
