Predicting High-Throughput Screening Results With Scalable Literature-Based Discovery Methods

The identification of new therapeutic uses for existing agents has been proposed as a means to mitigate the escalating cost of drug development. A common approach to such repurposing involves screening libraries of agents for activities against cell lines. In silico methods using knowledge from the biomedical literature have been proposed to constrain the costs of screening by identifying agents that are likely to be effective a priori. However, results obtained with these methods are seldom evaluated empirically. Conversely, screening experiments have been criticized for their inability to reveal the biological basis of their results. In this paper, we evaluate the ability of a scalable literature‐based approach, discovery‐by‐analogy, to identify a small number of active agents within a large library screened for activity against prostate cancer cells. The methods used permit retrieval of the knowledge used to infer their predictions, providing a plausible biological basis for predicted activity.
