Incorporating Document Keyphrases in Search Results

Effective and efficient searching and presentation of returned results are key to a search engine. Before downloading and examining the document text, users usually first judge the relevance of a returned hit to the query by looking at the document metadata presented with it. However, this metadata is usually not rich enough for users to predict the content of the document. Keyphrases provide a concise summary of a document’s content, offering subject metadata that characterizes and summarizes the document. In this paper, we propose a mechanism for enriching the metadata of returned results by incorporating automatically extracted document keyphrases in each returned hit. By looking at the keyphrases in each hit, the user can predict the content of the document more easily, quickly, and accurately. The experimental results show that our solution may save users up to 32% of their time, and that users would like to use our proposed search interface, in which document keyphrases are part of the metadata of each returned hit.
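As an illustration only, the idea of attaching keyphrases to each returned hit can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the system described in the paper: the `Hit` record, `enrich_hits`, `render`, and the frequency-based `naive_keyphrases` extractor are all hypothetical names introduced here, and the extractor is a simple stand-in for a real automatic keyphrase extractor.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "to", "for", "is", "on", "with"}


def naive_keyphrases(text, max_phrases=3):
    """Stand-in extractor: the most frequent two-word phrases without stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    bigrams = [
        f"{first} {second}"
        for first, second in zip(words, words[1:])
        if first not in STOPWORDS and second not in STOPWORDS
    ]
    return [phrase for phrase, _ in Counter(bigrams).most_common(max_phrases)]


@dataclass
class Hit:
    """Hypothetical search hit: the usual metadata plus a keyphrase field."""
    title: str
    url: str
    text: str
    keyphrases: list = field(default_factory=list)


def enrich_hits(hits, extractor=naive_keyphrases):
    """Attach extracted keyphrases to every hit's metadata."""
    for hit in hits:
        hit.keyphrases = extractor(hit.text)
    return hits


def render(hit):
    """Format one hit as it might appear in a result list."""
    return f"{hit.title}\n{hit.url}\nKeyphrases: {', '.join(hit.keyphrases)}"
```

In a real interface the extractor would be swapped for a trained keyphrase extraction component, and the keyphrases would likely be computed at indexing time rather than per query; the sketch only shows where the extra metadata sits in a returned hit.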
