Life-iNet: A Structured Network-Based Knowledge Exploration and Analytics System for Life Sciences

Search engines over scientific literature are widely used by life scientists to find publications related to their research. However, existing search engines in the life-science domain, such as PubMed, have limitations when used to explore and analyze factual knowledge (e.g., disease-gene associations) in massive text corpora. These limitations arise mainly because factual information exists in unstructured form in text, and because keyword and MeSH term-based queries cannot effectively express semantic relations between entities. This demo paper presents Life-iNet, a system that addresses these limitations of existing search engines in supporting life-sciences research. Life-iNet automatically constructs structured networks of factual knowledge from large collections of background documents, to support efficient exploration of the structured factual knowledge contained in the unstructured literature. It also provides functionality for finding distinctive entities for given entity types, and for generating hypothetical facts to assist literature-based knowledge discovery (e.g., drug target prediction).
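To illustrate the contrast between keyword search and querying a structured network of facts, the following minimal Python sketch stores typed (entity, relation, entity) facts and retrieves them by relation and entity type. It is not the Life-iNet implementation: the `Fact` and `FactNetwork` names, the `associated_with` and `targets` relation labels, and all entity names are hypothetical placeholders chosen for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    """One typed fact: (head entity, relation, tail entity). Hypothetical schema."""
    head: str
    head_type: str
    relation: str
    tail: str
    tail_type: str


class FactNetwork:
    """Toy in-memory network of facts, indexed by relation name."""

    def __init__(self) -> None:
        self._by_relation = defaultdict(list)

    def add(self, fact: Fact) -> None:
        self._by_relation[fact.relation].append(fact)

    def query(self, relation: str, head: Optional[str] = None,
              tail_type: Optional[str] = None) -> list:
        """Return facts with the given relation, optionally filtered
        by head entity and by the type of the tail entity."""
        return [
            f for f in self._by_relation.get(relation, [])
            if (head is None or f.head == head)
            and (tail_type is None or f.tail_type == tail_type)
        ]


# Placeholder facts; these are made-up names, not real biomedical findings.
network = FactNetwork()
network.add(Fact("disease_A", "Disease", "associated_with", "gene_X", "Gene"))
network.add(Fact("disease_A", "Disease", "associated_with", "gene_Y", "Gene"))
network.add(Fact("disease_B", "Disease", "associated_with", "gene_X", "Gene"))
network.add(Fact("drug_P", "Drug", "targets", "gene_X", "Gene"))

# A relation-aware query: which genes are associated with disease_A?
genes = [f.tail for f in network.query("associated_with",
                                       head="disease_A", tail_type="Gene")]
print(genes)
```

A keyword query for "disease_A gene" would return documents that mention both terms, whereas the query above names the relation explicitly and returns the related entities themselves.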
