A simple method to extract key terms

Quality business intelligence (BI) is widely recognised as critical to sustaining competitive advantage in a dynamic business environment. A major challenge for BI is retrieving business information efficiently and effectively for analysis that supports decision-making. Information extraction is an important issue in information retrieval (IR). We propose a simple method for extracting key terms from electronic documents based on syntactic rules, the geographic layout of the document, the occurrence of terms, and the co-occurrence of related terms. We also integrate the concept scope of terms into the method, which aids in ranking the extracted terms.
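
To make the combination of signals concrete, the following is a minimal Python sketch, not the method proposed here. It scores candidate terms using only three of the cues named above: term occurrence, document layout (approximated by whether a term appears in the title), and co-occurrence with title terms in the same sentence. The syntactic rules and the concept scope are omitted because the abstract does not specify them. The function names, the stopword list, and the weights `title_weight` and `cooc_weight` are all illustrative assumptions.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "for",
             "on", "with", "that", "this", "are", "as", "by", "from"}


def tokenize(text):
    """Lowercase the text and return alphabetic tokens minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS]


def extract_key_terms(title, body, top_n=5, title_weight=3.0, cooc_weight=0.5):
    """Rank single-word terms by occurrence, title position, and co-occurrence.

    The weights are hypothetical; they only show how the cues might be
    combined into one score.
    """
    title_terms = set(tokenize(title))
    sentences = [s for s in re.split(r"[.!?]+", body) if s.strip()]

    freq = Counter()  # occurrence: how often a term appears in the body
    cooc = Counter()  # sentences in which a term appears with another title term
    for sentence in sentences:
        words = tokenize(sentence)
        freq.update(words)
        unique = set(words)
        title_hits = title_terms & unique
        for word in unique:
            if title_hits - {word}:
                cooc[word] += 1

    scores = {}
    for term in set(freq) | title_terms:
        score = freq[term] + cooc_weight * cooc[term]
        if term in title_terms:
            score += title_weight  # layout cue: the term appears in the title
        scores[term] = score

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    ranked = extract_key_terms(
        "Key term extraction",
        "Term extraction finds key terms. "
        "Extraction uses layout and frequency. Weather was nice.",
        top_n=3,
    )
    for term, score in ranked:
        print(term, score)
```

A fuller implementation would presumably restrict candidates with part-of-speech patterns (for example, noun phrases) and re-rank them by concept scope, but those steps depend on details not given in this abstract.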
