An Investigation of Covid-19 Papers for a Content-Based Recommendation System

The proliferation of scientific publications is a well-known phenomenon that was recently underscored by the publications related to Covid-19. The number of Covid-19-related publications that PubMed added between January 17 and April 18, 2020 kept rising until it reached 300 publications added in a single day. This growth raises obvious issues, such as the difficulty researchers face in finding papers closely related to their applications. When searching for related papers, there can also be issues with how one paper is preferred over another. A paper could be recommended because it has more citations, or because of connections between authors, which are themselves related to the number of citations. For these reasons, the aim of this study is to build a recommendation system based exclusively on the abstracts of these publications. We compare classical NLP-based approaches, such as TF-IDF and n-grams, with Deep Learning approaches for content-based recommendation systems, such as Transformers. We also provide a graph-based application that shows the relationships among related papers on the basis of the results obtained from the recommendation system. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
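
To make the classical side of the comparison concrete, the following is a minimal sketch of a content-based recommender that ranks abstracts by cosine similarity of TF-IDF vectors built over word unigrams and bigrams. It is an illustration under stated assumptions, not the pipeline used in the study: the whitespace tokenizer, the smoothed IDF variant, the function names (`ngrams`, `tfidf_vectors`, `cosine`, `recommend`), and the toy abstracts are all hypothetical and chosen only for readability.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Lowercase, split on whitespace, and return all word n-grams for n = 1..n_max."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf_vectors(abstracts, n_max=2):
    """Return one sparse TF-IDF vector (dict: term -> weight) per abstract."""
    term_counts = [Counter(ngrams(a, n_max)) for a in abstracts]
    n_docs = len(abstracts)
    # Document frequency: in how many abstracts each term occurs.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    # Smoothed IDF; this is one common variant, assumed here for illustration.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in counts.items()} for counts in term_counts]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(abstracts, query_index, top_k=3):
    """Return up to top_k (index, score) pairs most similar to the query abstract."""
    vectors = tfidf_vectors(abstracts)
    scores = [
        (cosine(vectors[query_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    # Highest similarity first; ties broken by lower index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scores[:top_k]]


# Invented toy abstracts, used only to exercise the functions above.
toy_abstracts = [
    "covid-19 vaccine trial results in adults",
    "graph neural networks for traffic forecasting",
    "vaccine trial design for covid-19 in children",
]
top = recommend(toy_abstracts, query_index=0, top_k=1)
print(top)
```

Because the ranking depends only on the text of the abstracts, citation counts and author connections play no role in it, which is the property the study aims for. The pairwise scores returned by `recommend` could also serve as edge weights in a similarity graph over papers, although how the study actually constructs its graph is not specified here.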