Classification of Research Articles Hierarchically: A New Technique

The amount of research in all streams of science, engineering, medicine, and other fields is growing rapidly, and the number of research articles increases every day. In this dynamic environment, identifying and maintaining such a large collection of articles in one place and classifying them manually is becoming exhausting. Often, articles are allocated to subject areas simply on the basis of the journals in which they are published. This paper proposes an approach for handling such a huge volume of articles by classifying them into their respective categories based on the keywords extracted from the keyword section of each article. Query enrichment is performed by generating unigrams and bigrams of these keywords and assigning them weights using a probability measure. The Microsoft Academic Research dataset is used for the experiments, and the empirical results show the effectiveness of the proposed approach.
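The keyword-enrichment step described above can be illustrated with a minimal Python sketch. The abstract does not specify the probability measure or the matching rule, so everything here is an assumption made for illustration: the weight of a term is taken to be its relative frequency among all generated unigrams and bigrams, categories are represented by hypothetical term sets, and the function names `ngrams`, `enrich`, and `classify` are invented. The sketch scores a single flat level of categories; a hierarchical classifier would presumably repeat a similar step at each level of the category tree.

```python
from collections import Counter


def ngrams(keyword, n):
    """Return the word n-grams of one keyword phrase, lowercased."""
    tokens = keyword.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def enrich(keywords):
    """Map each unigram and bigram of the keywords to a weight.

    Assumption: weight = relative frequency of the term among all
    generated terms. The paper's actual probability measure may differ.
    """
    counts = Counter()
    for kw in keywords:
        counts.update(ngrams(kw, 1))
        counts.update(ngrams(kw, 2))
    total = sum(counts.values())
    return {term: c / total for term, c in counts.items()}


def classify(keywords, category_terms):
    """Pick the category whose (hypothetical) term set collects the most weight."""
    weights = enrich(keywords)
    scores = {
        cat: sum(w for term, w in weights.items() if term in terms)
        for cat, terms in category_terms.items()
    }
    return max(scores, key=scores.get)


# Illustrative data only; not taken from the dataset used in the paper.
article_keywords = ["text classification", "hierarchical classification"]
categories = {
    "Computer Science": {"classification", "text classification"},
    "Medicine": {"arthritis"},
}
print(classify(article_keywords, categories))
```

Using relative frequency keeps the weights summing to one, so scores for different categories are directly comparable; any other normalized measure could be substituted in `enrich` without changing the rest of the sketch.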
