Tourism activity recognition and discovery based on improved LDA model

Latent Dirichlet Allocation (LDA) is an unsupervised learning model that extracts hidden topics from text and has been widely used in recent years. In this paper, we propose Activity-topic LDA, an extension of the original LDA model that incorporates text category information. Building on LDA, the proposed method adds tourism activity information and obtains a probability distribution model of tourism activities. Based on this model, we can identify and discover the themes of tourism activities.
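The abstract does not specify how the activity information enters the model, so the following is only a minimal sketch under one possible reading: each document carries a known activity label, and the per-document topic mixture of standard LDA is replaced by one topic mixture per activity, estimated with collapsed Gibbs sampling. The function name `fit_activity_topic_lda`, its parameters, and the toy corpus are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def fit_activity_topic_lda(docs, activities, n_topics,
                           n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for an LDA variant with activity labels.

    ASSUMPTION (not from the paper): topic proportions are shared by all
    documents that have the same activity label.
    docs: list of token lists; activities: one label per document.
    Returns (theta, phi): theta[a][k] = P(topic k | activity a),
    phi[k][w] = P(word w | topic k).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_vocab = len(vocab)
    labels = sorted(set(activities))

    n_ak = {a: [0] * n_topics for a in labels}       # activity-topic counts
    n_kw = [defaultdict(int) for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                             # tokens per topic

    # Random initial topic assignment for every token.
    z = []
    for doc, a in zip(docs, activities):
        doc_z = []
        for w in doc:
            k = rng.randrange(n_topics)
            doc_z.append(k)
            n_ak[a][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(doc_z)

    for _ in range(n_iters):
        for d, (doc, a) in enumerate(zip(docs, activities)):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_ak[a][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (n_ak[a][t] + alpha)
                    * (n_kw[t][w] + beta) / (n_k[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_ak[a][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Smoothed point estimates of the two distributions.
    theta = {
        a: [(n_ak[a][t] + alpha) / (sum(n_ak[a]) + n_topics * alpha)
            for t in range(n_topics)]
        for a in labels
    }
    phi = [
        {w: (n_kw[t][w] + beta) / (n_k[t] + n_vocab * beta) for w in vocab}
        for t in range(n_topics)
    ]
    return theta, phi


# Hypothetical toy corpus: two activities, a handful of tokens each.
toy_docs = [
    ["trail", "summit", "boots", "trail"],
    ["summit", "boots", "map"],
    ["museum", "ticket", "gallery"],
    ["gallery", "museum", "guide", "ticket"],
]
toy_activities = ["hiking", "hiking", "sightseeing", "sightseeing"]
theta, phi = fit_activity_topic_lda(toy_docs, toy_activities, n_topics=2)
```

Under this assumption, `theta` gives the topic distribution of each activity and `phi` the word distribution of each topic, so the highest-probability words of the topics that dominate an activity serve as a rough description of that activity's theme.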
