Leveraging unstructured information using topic modelling

Unstructured information in the form of natural language text is abundant in organisations of all kinds. To improve information sharing, organisational learning, decision-making and productivity, large amounts of unstructured text need to be analysed on a daily basis. Full-text search alone is not sufficient as a first step in helping users understand what a collection of electronic documents is about, because it does not give them an overview of the underlying concepts in the collection.
