Rapid Tagging and Reporting for Functional Language Extraction in Scientific Articles

This paper describes the development of a web-based application for tagging scientific articles, built in part to create machine learning training datasets for automated functional language identification and extraction (AFLEX). The work was initially intended to add a new member to the ecosystem of tools that facilitate the structured automation of systematic reviews, which typically require critical analysis of multiple research studies and provide an exhaustive summary of the literature related to a research question. However, the tool's modular interface allows use across disciplines. A user may upload PDF or text documents, quickly tag selected parts of a document with a customizable set of discipline-specific tags, and export the results in CSV or JSON format. An integrated back-end database stores tagging data for comparison between taggers or for visual display of results in the web browser. While other discipline-specific text tagging tools exist, the authors have not encountered a cloud-based, customizable tool for PDF and text annotation as flexible as the AFLEX Tag Tool presented here.
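The tag, export, and compare workflow described above can be illustrated with a minimal sketch. This is not the AFLEX Tag Tool's actual data model or API: the `TagRecord` fields, the function names, and the sample data are all hypothetical, chosen only to show what a tagged span might look like when exported to CSV or JSON and compared between two taggers.

```python
import csv
import io
import json
from dataclasses import dataclass, asdict

# Hypothetical record layout; the real tool's schema is not given in the source.
FIELDS = ["document", "tagger", "start", "end", "tag", "text"]


@dataclass
class TagRecord:
    document: str  # name of the uploaded PDF or text file
    tagger: str    # who applied the tag
    start: int     # character offset where the tagged span begins
    end: int       # character offset where the tagged span ends
    tag: str       # one of the discipline-specific tags
    text: str      # the tagged text itself


def to_json(records):
    """Serialize tag records as a JSON array of objects."""
    return json.dumps([asdict(r) for r in records], indent=2)


def to_csv(records):
    """Serialize tag records as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for r in records:
        writer.writerow(asdict(r))
    return buf.getvalue()


def shared_spans(records, tagger_a, tagger_b):
    """Return the (document, start, end, tag) spans both taggers agree on."""
    def spans(tagger):
        return {(r.document, r.start, r.end, r.tag)
                for r in records if r.tagger == tagger}
    return spans(tagger_a) & spans(tagger_b)


# Invented sample data for illustration only.
records = [
    TagRecord("trial.pdf", "alice", 0, 24, "outcome", "Mortality fell by a third"),
    TagRecord("trial.pdf", "bob", 0, 24, "outcome", "Mortality fell by a third"),
    TagRecord("trial.pdf", "bob", 30, 52, "method", "randomized double-blind"),
]

print(to_csv(records))
print(shared_spans(records, "alice", "bob"))
```

Storing character offsets alongside the tag, as assumed here, is one simple way to make comparison between taggers a set intersection; a real system might instead tolerate partially overlapping spans.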
