Paper information - What Were People Searching For? A Query Log Analysis of An Academic Search Engine

Academic search engines have served the research community for years, yet little work has been done to understand the taxonomy of query semantics. In this work, we present findings from an analysis of the query log of an academic search engine over the past four years. We study the distribution of query intents to understand what information users request. We classify query strings by topic using shallow and latent features captured by a customized word embedding model. To this end, we create a dataset of scientific keywords and titles labeled with fields of study. This dataset is then used to train a classifier that categorizes logged queries by topic. Our work will help train better learning-based ranking functions that improve the user experience of an academic search engine. In addition, we anonymize our 14,759,852 query logs and make them available to the research community for further exploration.

C. Lee Giles | Shaurya Rohatgi
