Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications

Background

Scientific publications are documentary representations of defeasible arguments, supported by data and repeatable methods. They are the essential mediating artifacts in the ecosystem of scientific communications. The institutional “goal” of science is publishing results. The linear document publication format, dating from 1665, has survived transition to the Web.

Intractable publication volumes, the difficulty of verifying evidence, and observed problems in evidence and citation chains suggest a need for a web-friendly and machine-tractable model of scientific publications. This model should support digital summarization, evidence examination, challenge, verification and remix, and incremental adoption. Such a model must be capable of expressing a broad spectrum of representational complexity, ranging from minimal to maximal forms.

Results

The micropublications semantic model of scientific argument and evidence provides these features. Micropublications support natural language statements; data; methods and materials specifications; discussion and commentary; challenge and disagreement; as well as allowing many kinds of statement formalization.

The minimal form of a micropublication is a statement with its attribution. The maximal form is a statement with its complete supporting argument, consisting of all relevant evidence, interpretations, discussion and challenges brought forward in support of or opposition to it. Micropublications may be formalized and serialized in multiple ways, including in RDF. They may be added to publications as stand-off metadata.

An OWL 2 vocabulary for micropublications is available at http://purl.org/mp. A discussion of this vocabulary, along with RDF examples from the case studies, appears as OWL Vocabulary and RDF Examples in Additional file 1.

Conclusion

Micropublications, because they model evidence and allow qualified, nuanced assertions, can play essential roles in the scientific communications ecosystem in places where simpler, formalized and purely statement-based models, such as the nanopublications model, will not be sufficient. At the same time they will add significant value to, and are intentionally compatible with, statement-based formalizations.

We suggest that micropublications, generated by useful software tools supporting such activities as writing, editing, reviewing, and discussion, will be of great value in improving the quality and tractability of biomedical communications.
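To make the range from minimal to maximal forms concrete, the following is a minimal sketch in Python of how such a structure might be held in memory. It is an illustration only, not the OWL 2 vocabulary published at http://purl.org/mp: every class, field and method name here (Attribution, Element, Micropublication, supports, challenges, is_minimal) is a hypothetical choice made for this example, and the sample text values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribution:
    """Who asserted a statement and where (hypothetical fields)."""
    author: str
    source: str


@dataclass
class Element:
    """Any item in an argument: a statement, data, a method, or a comment."""
    kind: str  # e.g. "statement", "data", "method" (illustrative labels)
    text: str


@dataclass
class Micropublication:
    """A claim with its attribution and, optionally, its argument."""
    claim: Element
    attribution: Attribution
    supports: List[Element] = field(default_factory=list)
    challenges: List[Element] = field(default_factory=list)

    def is_minimal(self) -> bool:
        # Minimal form: a statement with its attribution and nothing else.
        return not self.supports and not self.challenges


# Minimal form: statement plus attribution only.
minimal = Micropublication(
    claim=Element("statement", "Example claim text"),
    attribution=Attribution(author="A. Author", source="example-article"),
)

# A fuller form: the same claim with supporting and challenging elements.
fuller = Micropublication(
    claim=minimal.claim,
    attribution=minimal.attribution,
    supports=[
        Element("data", "Example figure"),
        Element("method", "Example protocol"),
    ],
    challenges=[Element("statement", "Example counter-claim")],
)
```

Keeping support and challenge as separate optional lists is one way to allow incremental adoption: a record can start as a bare attributed statement and gain evidence and counter-arguments later without changing shape.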
