The prediction of research disciplines has gained increasing attention in recent years due to its potential applications in a variety of fields, such as academic advising, career counseling, and the allocation of research funding. Research information systems that store project (meta)data play a crucial role in managing and evaluating research (meta)data across disciplines and fields of study. In this context, research projects are manually assigned one or more research disciplines to facilitate management and evaluation. The assignment is usually done by research administrators, because the principal researchers themselves often have limited time. Besides being rather subjective and time-consuming, manual assignment can lead to inconsistencies in discipline assignments and hence affect the quality of the data used for monitoring and reporting.
To address these limitations, various approaches have been proposed in the literature to predict the disciplines associated with research documents, e.g., publications and projects. The methods most frequently used in bibliometrics are bibliographic coupling, co-citation, and direct citation [1]. These approaches use citation network analysis to determine the disciplines related to a publication. More recently, machine learning techniques have been applied to classify research documents [2]–[4]. In these approaches, the publications' abstracts are used as features to predict the related disciplines. Machine learning techniques have been shown to perform better than the traditional bibliometric approaches. Although these approaches are useful, they may not be applicable to research information systems that lack citation data or have low-quality abstracts. In this paper, we propose a novel approach to predict the disciplines of research projects in a research information system.
The proposed approach uses machine learning algorithms together with disciplines extracted from researchers and their related information, such as organizations, projects, co-authors on projects, publications, and co-authors on publications. By analyzing the disciplines from these sources, the proposed model can predict the most appropriate research disciplines for each project, providing a more objective and consistent approach to discipline assignment. This approach is particularly helpful when neither citation data nor high-quality abstracts are available. In the following sections, we describe the development and evaluation of our model, including the data sources and methods used to train the machine learning algorithms, as well as the performance metrics used to evaluate the accuracy and effectiveness of the proposed approach.
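As an illustration only, the idea of turning disciplines gathered from related sources into model inputs could be sketched as follows. This is not the implementation evaluated in this paper: the source names in `SOURCES`, the helper `build_features`, the toy vocabulary, and the choice of normalized frequency vectors are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical source names, mirroring the kinds of related information
# listed above (an assumption, not the paper's actual schema).
SOURCES = (
    "researchers",
    "organizations",
    "projects",
    "project_coauthors",
    "publications",
    "publication_coauthors",
)


def build_features(related, vocabulary):
    """Concatenate one normalized discipline-frequency vector per source.

    `related` maps a source name to the list of disciplines found there
    for a single project; sources that are missing contribute zeros.
    """
    features = []
    for source in SOURCES:
        counts = Counter(related.get(source, []))
        total = sum(counts.values())
        for discipline in vocabulary:
            features.append(counts[discipline] / total if total else 0.0)
    return features


# Toy example: a made-up project with disciplines from two sources.
vocabulary = ["computer science", "sociology", "physics"]
related = {
    "researchers": ["computer science", "computer science", "sociology"],
    "publications": ["computer science"],
}
features = build_features(related, vocabulary)
```

Under these assumptions, each project yields a fixed-length vector (number of sources times number of disciplines) that a multi-label classifier could be trained on, with the manually assigned disciplines as targets.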
[1] Éric Archambault, et al., "Article-level classification of scientific publications: A comparison of deep learning, direct citation and bibliographic coupling," PLoS ONE, 2021.
[2] Dieter Kranzlmüller, et al., "Using supervised learning to classify metadata of research data by field of study," Quantitative Science Studies, 2020.
[3] K. Boyack, et al., "Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately?," J. Assoc. Inf. Sci. Technol., 2010.
[4] S. Hochreiter, et al., "Long Short-Term Memory," Neural Computation, 1997.
[5] Sadia Vancauwenbergh, et al., "The Flemish Research Discipline Classification Standard: A Practical Approach," Knowledge Organization, 2019.
[6] Tim C. E. Engels, et al., "Article Level Classification of Publications in Sociology: An Experimental Assessment of Supervised Machine Learning Approaches," ISSI, 2019.