An NLP-based citation reason analysis using CCRO

Artificial Intelligence and Natural Language Processing are major contributors to recent advances in document classification and information extraction. Classifying citations into different classes has attracted considerable attention because of the large volume of citations available in digital libraries. Typical citation classification relies on sentiment analysis, in which various techniques are applied to citation texts, mainly to classify them as “Positive”, “Negative”, or “Neutral”. However, an author may cite another work for many different reasons. Citations’ Context and Reasons Ontology (CCRO) uses a clear scientific method to articulate eight basic reasons for citing, derived through an iterative process of sentiment analysis, collaborative meanings, and experts' opinions. Using CCRO, this paper adopts an ontology-based approach to extract citation reasons and to instantiate ontology classes and properties on two corpora of citation sentences. One corpus is a publicly available dataset; the other is our own manually curated corpus. The process has two steps. The first is an interface for manually annotating each citation text in the selected corpora with CCRO properties. A team of carefully selected annotators annotated each citation to achieve high inter-annotator agreement. The second step is the automatic extraction of these reasons: using Natural Language Processing, a Mapping Graph, and the reporting verb in a citation sentence, the citation's reason is extracted and mapped onto a CCRO property. The manual and automatic mappings are then compared, and accuracy is calculated for both the publicly available corpus and our own corpus of citation sentences.
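The second step and the accuracy comparison can be illustrated with a minimal Python sketch. It is not the paper's implementation: the verb-to-property table stands in for the Mapping Graph, the property names are hypothetical placeholders (the eight CCRO properties are not listed in this text), the small inflection table replaces a real lemmatizer, and the example sentences are invented.

```python
import re

# Hypothetical verb-to-property table; the real CCRO property names and
# the paper's Mapping Graph are not given in this text.
VERB_TO_PROPERTY = {
    "use": "hypothetical:uses",
    "extend": "hypothetical:extends",
    "criticize": "hypothetical:critiques",
    "compare": "hypothetical:compares",
}

# Tiny hand-written inflection table standing in for a real lemmatizer.
LEMMAS = {
    "uses": "use", "used": "use", "using": "use",
    "extends": "extend", "extended": "extend", "extending": "extend",
    "criticizes": "criticize", "criticized": "criticize",
    "compares": "compare", "compared": "compare",
}


def extract_reason(sentence):
    """Return the property for the first known reporting verb, else None."""
    for token in re.findall(r"[a-z]+", sentence.lower()):
        lemma = LEMMAS.get(token, token)
        if lemma in VERB_TO_PROPERTY:
            return VERB_TO_PROPERTY[lemma]
    return None


def accuracy(manual, automatic):
    """Fraction of citations whose automatic label equals the manual label."""
    if len(manual) != len(automatic):
        raise ValueError("label lists must have the same length")
    if not manual:
        return 0.0
    matches = sum(m == a for m, a in zip(manual, automatic))
    return matches / len(manual)


# Invented citation sentences with invented manual annotations.
sentences = [
    "We extend the parser of Smith et al. (2010).",
    "Our approach uses the corpus described in [5].",
    "This was discussed in earlier work [7].",
]
manual_labels = [
    "hypothetical:extends",
    "hypothetical:uses",
    "hypothetical:compares",
]
automatic_labels = [extract_reason(s) for s in sentences]
score = accuracy(manual_labels, automatic_labels)
```

In this toy run the third sentence contains no verb from the table, so it receives no automatic label and counts as a mismatch against the manual annotation, giving two matches out of three.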
