Semantic Subgroup Discovery and Cross-Context Linking for Microarray Data Analysis

The article presents an approach to computational knowledge discovery through the mechanism of bisociation. Bisociative reasoning is at the heart of creative, accidental discovery (e.g., serendipity) and focuses on finding unexpected links by crossing contexts. Contextualizing and linking highly diverse and distributed data and knowledge sources are therefore crucial for implementing bisociative reasoning. We explore these ideas in the context of microarray data analysis. We show how enriched gene sets are found by using ontology information as background knowledge in semantic subgroup discovery. The genes in these sets are then contextualized by computing probabilistic links to diverse bioinformatics resources. Preliminary experiments with microarray data illustrate the approach.
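
To make the notion of an "enriched gene set" concrete, the following is a minimal sketch of one common way to score enrichment: a one-sided hypergeometric test of whether genes annotated with an ontology term are over-represented among the differentially expressed genes. It is an illustration under stated assumptions, not necessarily the scoring used in the article. The function names, the toy gene identifiers, the term names, and the 0.05 threshold are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, annotated, selected, overlap):
    """One-sided p-value P(X >= overlap) for a hypergeometric draw.

    universe_size: number of genes measured on the array
    annotated:     genes in the universe annotated with the ontology term
    selected:      differentially expressed genes
    overlap:       differentially expressed genes that carry the term
    """
    max_k = min(annotated, selected)
    total = comb(universe_size, selected)
    tail = sum(
        comb(annotated, k) * comb(universe_size - annotated, selected - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


def enriched_terms(term_to_genes, diff_expressed, universe, alpha=0.05):
    """Return (term, p_value) pairs with p_value <= alpha, smallest first.

    No multiple-testing correction is applied in this sketch.
    """
    universe = set(universe)
    selected = set(diff_expressed) & universe
    results = []
    for term, genes in term_to_genes.items():
        annotated = set(genes) & universe
        overlap = len(annotated & selected)
        p = hypergeom_enrichment_p(
            len(universe), len(annotated), len(selected), overlap
        )
        if p <= alpha:
            results.append((term, p))
    return sorted(results, key=lambda pair: pair[1])


# Hypothetical toy data: 20 genes, 5 of them differentially expressed.
universe = [f"g{i}" for i in range(20)]
diff_expressed = ["g0", "g1", "g2", "g3", "g4"]
term_to_genes = {
    "termA": ["g0", "g1", "g2", "g3"],     # all four are differentially expressed
    "termB": ["g4", "g10", "g11", "g12"],  # only one is differentially expressed
}

hits = enriched_terms(term_to_genes, diff_expressed, universe)
```

In this toy setting only `termA` passes the threshold. A realistic analysis would correct for the many terms tested and could combine terms from several ontologies when describing a gene set.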
