PandaSearch: A fine-grained academic search engine for research documents

In the world of academia, research documents enable the sharing and dissemination of scientific discoveries. During these “big data” times, academic search engines are widely used to find the relevant research documents. Considering the domain of computer science, a researcher often inputs a query with a specific goal to find an algorithm or a theorem. However, to this date, the return result of most search engines is just as a list of related papers. Users have to browse the results, download the interesting papers and look for the desired information, which is obviously laborious and inefficient. In this paper, we present a novel academic search system, called PandaSearch, that returns the results with a fine-grained interface, where the results are well organized by different categories, such as definitions, theorems, lemmas, algorithms and figures. The key technical challenges in our system include the automatic identification and extraction of different parts in a research document, the discovery of the main topic phrases for a definition or a theorem, and the recommendation of related definitions or figures to elegantly satisfy the search intention of users. Based on this, we have built a user friendly search interface for users to conveniently explore the documents, and find the relevant information.