Fuzzy Bayesian validation for cluster analysis of yeast cell-cycle data

Clustering in gene analysis organizes patterns into groups according to their similarity, and it has been used both to identify the functions of the genes within a cluster and to infer the functions of unknown genes. Because genes usually belong to multiple functional families, fuzzy clustering methods are more appropriate than conventional hard clustering methods, which assign each sample to exactly one group. This paper proposes a Bayesian-like validation method for evaluating fuzzy partitions and selecting among them. The theoretical interpretation of the resulting memberships is beyond the scope of this paper; instead, the proposed method is evaluated empirically by comparing it with four representative conventional fuzzy cluster validity measures on four well-known datasets. The method is then applied to the analysis of yeast cell-cycle data.
