Mining hidden connections among biomedical concepts from disjoint biomedical literature sets through semantic‐based association rule

Swanson uncovered the novel connection between Raynaud disease and fish oils from two disjoint biomedical literature sets in 1986. Since then, many approaches have been proposed to uncover novel connections by mining the biomedical literature. One popular approach adapts the association rule (AR) method to automatically identify implicit novel connections between a concept A and a concept C from two disjoint sets of documents through an intermediate concept B. Because A and C do not occur together in the same data set, the mining goal is to find novel connections between them across the disjoint data sets. The approach first applies association rule mining to each of the two disjoint biomedical literature sets separately to generate two rule sets (A→B and B→C), and then applies the transitive law to obtain the novel connections A→C. However, this approach generates a huge number of possible connections among the millions of biomedical concepts, and many of these hypothetical connections are spurious, useless, and/or biologically meaningless. It is therefore essential to develop a new approach that generates novel connections that are highly likely to be biologically relevant. This paper presents a biomedical semantic‐based association rule system (Bio‐SARS) that significantly reduces spurious, useless, or biologically irrelevant connections through semantic filtering. Compared with other approaches, such as latent semantic indexing and the traditional association rule‐based approach, our approach generates far fewer rules, and many of these rules represent relevant connections among biological concepts. © 2009 Wiley Periodicals, Inc.
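The pipeline described in the abstract (mine A→B and B→C rules separately, chain them by transitivity, then filter semantically) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the Bio‐SARS implementation: the documents, the support and confidence thresholds, the `SEMANTIC_TYPE` labels, and the function names `mine_rules` and `transitive_links` are all hypothetical, and the filter here simply restricts which semantic types may serve as B or C concepts.

```python
from collections import defaultdict

def mine_rules(docs, min_support=2, min_confidence=0.3):
    """Mine pairwise rules x -> y from documents given as sets of concepts."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for concepts in docs:
        for x in concepts:
            item_count[x] += 1
            for y in concepts:
                if x != y:
                    pair_count[(x, y)] += 1
    rules = {}
    for (x, y), n in pair_count.items():
        confidence = n / item_count[x]
        if n >= min_support and confidence >= min_confidence:
            rules[(x, y)] = confidence
    return rules

def transitive_links(rules_ab, rules_bc, a, known_with_a,
                     b_ok=lambda b: True, c_ok=lambda c: True):
    """Chain A->B and B->C rules into candidate A->C links, keyed by C."""
    links = defaultdict(set)
    for (x, b) in rules_ab:
        if x != a or not b_ok(b):
            continue
        for (b2, c) in rules_bc:
            # Keep only C concepts not already seen with A (i.e. novel).
            if b2 == b and c != a and c not in known_with_a and c_ok(c):
                links[c].add(b)
    return {c: sorted(bs) for c, bs in links.items()}

# Hypothetical toy literature sets; each document is a set of concepts.
docs_a = [
    {"raynaud disease", "blood viscosity"},
    {"raynaud disease", "blood viscosity", "platelet aggregation"},
    {"raynaud disease", "platelet aggregation"},
    {"raynaud disease", "patients"},
    {"raynaud disease", "patients"},
]
docs_c = [
    {"blood viscosity", "fish oils"},
    {"blood viscosity", "fish oils"},
    {"platelet aggregation", "fish oils"},
    {"platelet aggregation", "fish oils"},
    {"patients", "hospitals"},
    {"patients", "hospitals"},
]

# Hypothetical semantic labels, used only to illustrate the filtering idea.
SEMANTIC_TYPE = {
    "raynaud disease": "disease",
    "blood viscosity": "function",
    "platelet aggregation": "function",
    "fish oils": "substance",
    "patients": "group",
    "hospitals": "organization",
}

a = "raynaud disease"
rules_ab = mine_rules(docs_a)
rules_bc = mine_rules(docs_c)
known_with_a = set().union(*(d for d in docs_a if a in d)) - {a}

# Plain transitive chaining: includes the uninformative link via "patients".
unfiltered = transitive_links(rules_ab, rules_bc, a, known_with_a)

# Semantic filtering: B must be a function, C must be a substance.
filtered = transitive_links(
    rules_ab, rules_bc, a, known_with_a,
    b_ok=lambda b: SEMANTIC_TYPE.get(b) == "function",
    c_ok=lambda c: SEMANTIC_TYPE.get(c) == "substance",
)

print(unfiltered)
print(filtered)
```

In this toy data, plain chaining proposes both fish oils and hospitals as C concepts for Raynaud disease, while the type restriction keeps only fish oils, which illustrates how a semantic filter could shrink the candidate rule set.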
