Post-processing strategies for improving local gene expression pattern analysis

This paper proposes a new analytical process that combines a soft subspace clustering method, a changing window technique, and a series of post-processing strategies to improve the identification and characterization of local gene expression patterns. The process can be run interactively, which facilitates the exploration and analysis of local gene expression patterns in real applications. Experimental results show that the proposed method is effective in identifying and characterizing functional gene groups, in terms of both the local expression similarity and the biological coherence of the genes in a cluster.
