XIP Dashboard: visual analytics from automated rhetorical parsing of scientific metadiscourse

A key competency that we seek to build in learners is a critical mind, i.e. the ability to engage with the ideas in the literature and to identify when significant claims are being made in articles. The ability to decode such moves in texts is essential, as is the ability to make such moves in one’s own writing. Computational techniques for extracting them are becoming available, using Natural Language Processing (NLP) tuned to recognize the rhetorical signals that authors use when making a significant scholarly move. After reviewing related NLP work, we introduce the Xerox Incremental Parser (XIP), note previous work to render its output, and then motivate the design of the XIP Dashboard, a set of visual analytics modules built on XIP output, using the LAK/EDM open dataset as a test corpus. We report preliminary user reactions to a paper prototype of this novel dashboard, describe the visualizations implemented to date, and present user scenarios for learners, educators and researchers. We conclude with a summary of ongoing design refinements, potential platform integrations, and questions that need to be investigated through end-user evaluations.
