Towards better understanding of academic search

Academics rely heavily on search engines to identify and locate research manuscripts related to their research areas. Many early information retrieval systems and technologies were developed for librarians, to help them sift through books and proceedings; online academic search engines such as Google Scholar and Microsoft Academic Search followed more recently. Despite their popularity among academics and their importance to academia, the usage, query behaviors, and retrieval models of academic search engines have not been well studied. To this end, we study the distribution of queries received by an academic search engine. We then examine academic search queries more closely and classify them into navigational and informational queries. This work introduces a definition of navigational queries in academic search engines, under which a query is considered navigational if the user is searching for a specific paper or document. We describe multiple facets of navigational academic queries and introduce a machine learning approach, with a set of features, to identify such queries.
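As a rough illustration of what a feature set for identifying navigational queries might look like, the sketch below computes a few surface-level features of a query string. The abstract does not list the actual features, so every feature here (token count, presence of a year, quotes, a colon, ratio of capitalized tokens) and the function name `query_features` are assumptions chosen for illustration, not the features used in this work.

```python
import re

# Matches a four-digit year such as 1998 or 2012 (hypothetical feature).
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


def query_features(query: str) -> dict:
    """Return illustrative surface features for one search query.

    These features are assumptions for the sake of the example; a
    paper-title-like query tends to be longer and may carry a year,
    quotes, or a colon, which a classifier could learn to exploit.
    """
    tokens = query.split()
    n = len(tokens)
    capitalized = sum(1 for t in tokens if t[:1].isupper())
    return {
        "num_tokens": n,
        "has_year": bool(YEAR_RE.search(query)),
        "has_quotes": '"' in query,
        "has_colon": ":" in query,
        "capitalized_ratio": capitalized / n if n else 0.0,
    }


if __name__ == "__main__":
    # A title-like query versus a topical one.
    print(query_features("CiteSeer: an automatic citation indexing system 1998"))
    print(query_features("query intent classification"))
```

Feature dictionaries of this kind could be fed to any standard classifier trained on labeled navigational and informational queries; which model and which features work best is an empirical question that this sketch does not address.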
