Automatic Extraction of Nanoparticle Properties Using Natural Language Processing: NanoSifter an Application to Acquire PAMAM Dendrimer Properties

In this study, we demonstrate the use of natural language processing methods to extract numeric values of biomedical properties of poly(amidoamine) (PAMAM) dendrimers from the nanomedicine literature. We developed a method for extracting these values for property terms taken from the NanoParticle Ontology, using the General Architecture for Text Engineering (GATE) and A Nearly-New Information Extraction System (ANNIE). We also created a method, called NanoSifter, for associating the identified numeric values with their corresponding dendrimer properties. We demonstrate that our system can correctly extract numeric values of dendrimer properties reported in the cancer treatment literature with high recall, precision, and F-measure. The micro-averaged recall was 0.99, precision was 0.84, and F-measure was 0.91. Similarly, the macro-averaged recall was 0.99, precision was 0.87, and F-measure was 0.92. To our knowledge, this work is the first application of text mining to extract dendrimer property terms and associate them with their corresponding numeric values.
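To make the distinction between the micro- and macro-averaged scores concrete, the following minimal Python sketch computes both from per-property counts of true positives, false positives, and false negatives. The property names and counts are hypothetical values chosen for illustration and are not data from this study. The macro-averaged F-measure is taken here as the mean of the per-property F-measures, which is one common convention and an assumption about how the reported values were obtained.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # extracted values that match the annotated value
    fp: int  # extracted values that are wrong or spurious
    fn: int  # annotated values the system missed


def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F-measure) for one set of counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def micro_average(counts: dict[str, Counts]) -> tuple[float, float, float]:
    """Pool the counts over all properties, then score once."""
    tp = sum(c.tp for c in counts.values())
    fp = sum(c.fp for c in counts.values())
    fn = sum(c.fn for c in counts.values())
    return prf(tp, fp, fn)


def macro_average(counts: dict[str, Counts]) -> tuple[float, float, float]:
    """Score each property separately, then take the unweighted mean."""
    scores = [prf(c.tp, c.fp, c.fn) for c in counts.values()]
    n = len(scores)
    return (
        sum(s[0] for s in scores) / n,
        sum(s[1] for s in scores) / n,
        sum(s[2] for s in scores) / n,
    )


# Hypothetical counts for two dendrimer properties (illustration only).
example_counts = {
    "molecular weight": Counts(tp=8, fp=2, fn=0),
    "hydrodynamic diameter": Counts(tp=9, fp=1, fn=1),
}

micro = micro_average(example_counts)
macro = macro_average(example_counts)
print("micro P/R/F:", [round(x, 2) for x in micro])
print("macro P/R/F:", [round(x, 2) for x in macro])
```

Micro-averaging weights every extracted value equally, so properties that occur often dominate the score, whereas macro-averaging weights every property equally regardless of how often it appears.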
