Gene expression data analysis using closed item set mining for labeled data.

This article presents an approach to microarray data analysis that combines discretised expression values with closed item set mining for class-labeled data (RelSets). A statistical 2 x 2 factorial design analysis was run in parallel on the same data. The approach was validated on two independent sets of two-color microarray experiments on potato plants. Our results show that the two analytical procedures, applied to the same data, answer two different biological questions. Statistical analysis is appropriate when an overview of the effects of the treatments and their interaction terms on the studied system is needed. If, on the other hand, a list of genes whose expression (upregulation or downregulation) differentiates between classes of data is required, the RelSets algorithm is preferred. The algorithms used are freely available from the authors upon request.
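The general idea of mining closed item sets from discretised, class-labeled expression data can be sketched as follows. This is a minimal illustration, not the RelSets implementation, which is not reproduced here. The threshold of 1.0 on log2 ratios, the gene names, the sample values, and all function names are assumptions made for the example. Each sample is turned into a set of items such as `g1_up` or `g2_down`; closed item sets are then enumerated in the positive class as intersections of samples, and their support is counted in both classes.

```python
from itertools import combinations


def discretise(sample, threshold=1.0):
    """Map {gene: log2 ratio} to items like 'g1_up' / 'g1_down'.

    The threshold is an assumed cut-off, not a value taken from the article.
    """
    items = set()
    for gene, value in sample.items():
        if value >= threshold:
            items.add(f"{gene}_up")
        elif value <= -threshold:
            items.add(f"{gene}_down")
    return frozenset(items)


def closed_item_sets(transactions):
    """Return all non-empty closed item sets of the given transactions.

    A closed item set equals the intersection of the transactions that
    contain it, so closing the transactions under pairwise intersection
    enumerates all of them (fine for small examples only).
    """
    closed = {t for t in transactions if t}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(closed), 2):
            common = a & b
            if common and common not in closed:
                closed.add(common)
                changed = True
    return closed


def support(item_set, transactions):
    """Count the transactions that contain every item of item_set."""
    return sum(1 for t in transactions if item_set <= t)


# Hypothetical log2 expression ratios for two classes of samples.
positive_samples = [
    {"g1": 1.5, "g2": -1.2, "g3": 0.1},
    {"g1": 2.0, "g2": -1.8, "g3": 1.1},
]
negative_samples = [
    {"g1": 1.3, "g2": 0.2, "g3": 0.0},
    {"g1": 0.1, "g2": 0.3, "g3": -0.2},
]

positives = [discretise(s) for s in positive_samples]
negatives = [discretise(s) for s in negative_samples]

# Closed item sets of the positive class with their support in each class.
closed = closed_item_sets(positives)
for item_set in sorted(closed, key=len):
    print(sorted(item_set),
          "positive:", support(item_set, positives),
          "negative:", support(item_set, negatives))
```

Item sets that cover many positive samples and few negative ones are the candidates for gene lists that differentiate between the classes; how such sets are filtered for relevance is specific to the algorithm and is not shown here.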
