Use of “Internal Knowledge”: Biomedical Literature Search Liberated From External Resources

Knowledge plays an essential role in biomedical literature search (BLS) systems, bridging the semantic gap between queries and documents. Knowledge bases, constructed by human experts or by machine learning methods, are generally regarded as the main sources of external knowledge. However, a good knowledge base must balance specificity and generality, which limits its knowledge coverage and how well a BLS system can utilize it. Given the massive number of documents in a BLS system, and recently developed Open IE techniques that automatically extract structured knowledge from documents, why not harness distilled internal knowledge rather than external knowledge to drive BLS systems? Internal knowledge, which provides knowledge tailored to the BLS system, is expected to lead to better knowledge utilization and more competitive literature search performance. In this paper, we design a novel internal-knowledge-driven BLS system built on Multi-layered Encoders incorporating a Multi-layered internal Knowledge graph, called MEMK. MEMK harnesses distilled internal structural knowledge to support interactive representation learning of queries and documents. Experiments show that MEMK outperforms strong baselines on a public benchmark, and that internal-knowledge-based query expansion further improves performance to a new state of the art.
