SemPathFinder: Semantic path analysis for discovering publicly unknown knowledge

The enormous volume of natural-language text in biomedicine makes it a daunting challenge to discover novel and interesting patterns embedded in text corpora, patterns that could help biomedical professionals find new drugs and treatments. These patterns consist of entities such as genes, compounds, treatments, and side effects, together with their associations, which are spread across publications in different biomedical specialties. This paper proposes SemPathFinder to discover previously unknown relations in biomedical text. SemPathFinder overcomes the problems of Swanson's ABC model by using semantic path analysis to tell a story about plausible connections between biological terms. Storytelling-based semantic path analysis can be viewed as relation navigation among bio-entities that are semantically close to each other. It reveals how a series of entity pairs is organized and how that organization can be harnessed to explain seemingly unrelated connections. We apply SemPathFinder to two well-known use cases of Swanson's ABC model. The experimental results show that SemPathFinder detects all but one of the intermediate terms and also infers several interesting new hypotheses.
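To make the idea of navigating from a start term to a target term through intermediate entities concrete, here is a minimal sketch in Python. It is not SemPathFinder's algorithm: it uses a plain breadth-first enumeration of short paths with no semantic scoring, and the function name `find_paths`, the `max_hops` parameter, and the hand-built toy graph are all assumptions made for illustration. The two intermediate terms in the toy graph are taken from Swanson's fish oil and Raynaud's syndrome example.

```python
from collections import deque


def find_paths(graph, start, goal, max_hops=3):
    """Enumerate simple paths from start to goal using at most max_hops edges.

    graph maps each entity name to the set of entities it links to.
    This is a plain breadth-first search, shown only as an illustration.
    """
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # path already uses the maximum number of edges
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in path:  # keep paths simple (no repeated entity)
                queue.append(path + [neighbor])
    return paths


# Hypothetical toy graph, hand-built for this example (not data from the paper).
toy_graph = {
    "fish oil": {"blood viscosity", "platelet aggregation"},
    "blood viscosity": {"Raynaud's syndrome"},
    "platelet aggregation": {"Raynaud's syndrome"},
    "Raynaud's syndrome": set(),
}

paths = find_paths(toy_graph, "fish oil", "Raynaud's syndrome")
# Intermediate (B) terms are the interior nodes of each A-to-C path.
intermediates = sorted({term for p in paths for term in p[1:-1]})

for p in paths:
    print(" -> ".join(p))
```

In this sketch every connecting path counts equally. A semantic approach of the kind the abstract describes would additionally weigh how closely related the entities along each path are, which this example does not attempt.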
