Luppar: An Information Retrieval System for Closed Document Collections

This article presents Luppar, an Information Retrieval tool for closed document collections that uses a local distributional semantic model associated with each corpus. The system performs automatic query expansion by combining the distributional semantic model with local context analysis, and it supports relevance feedback. The system was evaluated on collections from different domains and achieved results equal to or better than those published in the literature.
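As a rough illustration of the distributional side of query expansion, the sketch below builds document-level co-occurrence vectors from a toy corpus and appends the nearest neighbours of each query term to the query. It is not Luppar's implementation: the function names (`build_cooccurrence`, `cosine`, `expand_query`), the whitespace tokenisation, the cosine measure, and the parameter `k` are all assumptions made for the example, and local context analysis and relevance feedback are not modelled.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_cooccurrence(docs):
    """Map each term to a Counter of the terms that share a document with it."""
    vectors = defaultdict(Counter)
    for doc in docs:
        terms = set(doc.lower().split())  # naive whitespace tokenisation (assumption)
        for t in terms:
            for u in terms:
                if u != t:
                    vectors[t][u] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def expand_query(query, vectors, k=2):
    """Return the query terms plus the k candidates most similar to them."""
    terms = query.lower().split()
    scores = Counter()
    for q in terms:
        if q not in vectors:
            continue  # term never seen in the collection
        for cand, vec in vectors.items():
            if cand not in terms:
                scores[cand] += cosine(vectors[q], vec)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return terms + [t for t, s in ranked[:k] if s > 0]


# Hypothetical three-document collection, used only for this example.
docs = ["cat feline pet", "cat kitten pet", "car engine road"]
vectors = build_cooccurrence(docs)
expanded = expand_query("cat", vectors, k=2)
print(expanded)
```

Because the vectors are built from the collection itself, the expansion terms always come from the vocabulary of the closed corpus, which is the property a local model is meant to provide.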
