Connecting wikis and natural language processing systems

We investigate the integration of Wiki systems with automated natural language processing (NLP) techniques. The vision is that of a "self-aware" Wiki system reading, understanding, transforming, and writing its own content, as well as supporting its users in information analysis and content development. We provide a number of practical application examples, including index generation, question answering, and automatic summarization, which demonstrate the practicability and usefulness of this idea. A system architecture providing the integration is presented, as well as first results from an initial implementation based on the GATE framework for NLP and the MediaWiki system.
