A Dempster-Shafer Model for Document Retrieval using Noun Phrases

We propose a document retrieval system based on natural language processing of documents and queries. Single terms and term groups serve as the indexing elements that represent documents and queries. The model is expressed formally within the Dempster-Shafer Theory of Evidence, and we discuss in detail how the theory is used to represent the document collection, indexing elements, documents, and queries. The retrieval function is derived directly from the underlying theory. We then present an implementation of the model and report the experimental work carried out.
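As general background on the underlying theory, the following is a minimal sketch of how a Dempster-Shafer mass function over indexing elements yields belief and plausibility values. It is not the paper's retrieval function, which this abstract does not specify. The frame of discernment, the mass values, and the names `belief`, `plausibility`, `doc_mass`, and `query` are all hypothetical and chosen only for illustration.

```python
# Illustrative only: a toy mass function over hypothetical indexing elements.
# The frame, the masses, and the function names are assumptions, not the
# paper's actual model.

def belief(mass, hypothesis):
    """Bel(H): total mass of focal sets entirely contained in H."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)

def plausibility(mass, hypothesis):
    """Pl(H): total mass of focal sets that intersect H."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)

# Hypothetical frame of discernment: three indexing elements.
frame = frozenset({"retrieval", "evidence", "noun phrase"})

# Hypothetical mass assignment for one document; masses sum to 1.
# Mass on the whole frame represents uncommitted belief (ignorance).
doc_mass = {
    frozenset({"retrieval"}): 0.5,
    frozenset({"retrieval", "evidence"}): 0.3,
    frame: 0.2,
}

# Hypothetical query expressed as a set of indexing elements.
query = frozenset({"retrieval", "evidence"})

bel = belief(doc_mass, query)
pl = plausibility(doc_mass, query)
print(f"Bel = {bel:.2f}, Pl = {pl:.2f}")
```

In this toy setting, belief counts only the mass committed to subsets of the query, while plausibility also counts mass that merely overlaps with it, so belief never exceeds plausibility.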
