Interpreting Fine-Grained Categories from Natural Language Queries of Entity Search

Fine-grained target categories (types) are critical for improving entity search performance, because they allow irrelevant entities to be filtered out with high confidence when retrieving relevant ones. However, most entity search solutions lack fine-grained target categories for queries, since such categories are hard for users to specify explicitly. In this paper, we interpret fine-grained categories from natural language queries for entity search. We observe that entity search queries often contain terms that specify the context of the desired entities, as well as their topic. Accordingly, we propose to interpret the fine-grained categories of entity search queries from both the context perspective and the topic perspective, and we formalize a context-based category model and a topic-based category model to tackle the category interpretation task. Extensive experiments on two widely used test sets, INEX-XER 2009 and SemSearch-LS, indicate that the proposed method achieves significant performance improvements over state-of-the-art baselines.
