Can your paper evade the editors axe? Towards an AI assisted peer review system

This work is an exploratory study of how we could move a step closer to an AI-assisted peer-review system. The proposed approach is an ambitious attempt to automate desk rejection, a practice prevalent in academic peer review. In this investigation, we first attempt to identify the possible reasons for the rejection of a scientific manuscript at the editor's desk. To address those causes, we combine a range of information extraction techniques, clustering, and citation analysis to formulate a supervised solution to the identified problems. The proposed approach integrates two important aspects of rejection: i) a paper being rejected because it is out of scope and ii) a paper being rejected due to poor quality. We extract several features to quantify the quality of a paper and its degree of being in scope, exploring keyword search, citation analysis, the reputations of authors and affiliations, and similarity with respect to accepted papers. The features are then fed to standard machine-learning classifiers to develop an automated system. On a decent-sized set of test data, our generic approach yields promising results across three different journals. The study points to the possibility of renewed interest within the research community in the study of rejected papers and encourages a drive towards an automated peer-review system.
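As a rough illustration of the in-scope features mentioned above (keyword search and similarity with respect to accepted papers), the following minimal sketch computes a small feature dictionary for one submission. It is not the authors' implementation: the function `scope_features`, the bag-of-words cosine similarity, and the feature names are assumptions chosen for illustration, and the abstract does not specify how these features are actually computed. In a full pipeline such features would be passed, together with quality features, to a standard classifier.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def scope_features(submission, accepted_papers, journal_keywords):
    """Hypothetical in-scope features for one submission.

    submission:       text of the candidate manuscript
    accepted_papers:  texts of papers the journal has accepted
    journal_keywords: single-word keywords describing the journal's scope
    """
    sub_counts = Counter(tokenize(submission))

    # Similarity of the submission to each previously accepted paper.
    sims = [cosine(sub_counts, Counter(tokenize(p))) for p in accepted_papers]
    max_sim = max(sims) if sims else 0.0
    mean_sim = sum(sims) / len(sims) if sims else 0.0

    # Fraction of the journal's keywords that occur in the submission.
    keywords = {k.lower() for k in journal_keywords}
    hits = keywords & set(sub_counts)
    coverage = len(hits) / len(keywords) if keywords else 0.0

    return {
        "max_similarity": max_sim,
        "mean_similarity": mean_sim,
        "keyword_coverage": coverage,
    }
```

A submission that shares vocabulary with accepted papers gets similarity scores closer to 1, while one with no overlapping terms scores 0 on every feature. Raw term counts are used here only to keep the sketch dependency-free; weighted or embedding-based representations would be natural replacements.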
