Fuzzy Patterns and GCS Networks to Clustering Gene Expression Data

The advent of DNA microarray technology has supplied a large volume of data to fields such as machine learning and data mining. A gene expression profile measures thousands of genes at the same time and reflects complex relationships among them. In this context, intelligent support is essential for managing and interpreting this large amount of information. A well-known constraint of microarray data is that the number of genes is very large compared with the small number of available experiments. In this situation, the ability to design methods that overcome the limitations of current state-of-the-art algorithms is crucial to the development of successful applications. In this chapter we present a flexible framework for feature selection and classification of microarray data. Dimensionality reduction is achieved by applying a supervised fuzzy pattern algorithm that reduces and discretizes existing gene expression profiles. An informed growing cell structures (GCS) network is then proposed for clustering biologically homogeneous experiments, starting from the simplified microarray data. Experimental results on different data sets containing acute myeloid leukemia profiles show the effectiveness of the proposed method.
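To make the idea of supervised fuzzy discretization and gene reduction more concrete, here is a minimal Python sketch. It is not the chapter's algorithm, whose details the abstract does not give. The piecewise-linear Low/Medium/High membership shapes, the per-gene min/median/max anchors, the `threshold` parameter, the selection rule (keep a gene only when the class patterns disagree on it), and all function names are assumptions made for illustration.

```python
from collections import Counter
from statistics import median

LABELS = ("Low", "Medium", "High")


def memberships(x, lo, mid, hi):
    """Degrees of membership of x in (Low, Medium, High).

    Assumed shapes: Low falls linearly from lo to mid, Medium peaks at mid,
    High rises linearly from mid to hi.
    """
    if x <= lo:
        return (1.0, 0.0, 0.0)
    if x >= hi:
        return (0.0, 0.0, 1.0)
    if x <= mid:
        t = (x - lo) / (mid - lo)
        return (1.0 - t, t, 0.0)
    t = (x - mid) / (hi - mid)
    return (0.0, 1.0 - t, t)


def discretize(x, lo, mid, hi):
    """Replace a numeric expression value by its strongest fuzzy label."""
    degrees = memberships(x, lo, mid, hi)
    return LABELS[degrees.index(max(degrees))]


def class_pattern(rows, stats, threshold):
    """Map gene index -> label for genes whose most frequent label covers
    at least `threshold` of the samples in this class."""
    pattern = {}
    for g, (lo, mid, hi) in enumerate(stats):
        counts = Counter(discretize(row[g], lo, mid, hi) for row in rows)
        label, n = counts.most_common(1)[0]
        if n / len(rows) >= threshold:
            pattern[g] = label
    return pattern


def select_genes(data, threshold=0.6):
    """Build one pattern per class and keep genes on which the patterns differ."""
    all_rows = [row for rows in data.values() for row in rows]
    n_genes = len(all_rows[0])
    # Per-gene anchors for the membership functions, taken over all samples.
    stats = []
    for g in range(n_genes):
        column = [row[g] for row in all_rows]
        stats.append((min(column), median(column), max(column)))
    patterns = {c: class_pattern(rows, stats, threshold) for c, rows in data.items()}
    # Assumed rule: a gene is informative when the class patterns do not all
    # agree on it (a missing entry counts as a distinct value).
    selected = [
        g for g in range(n_genes)
        if len({p.get(g) for p in patterns.values()}) > 1
    ]
    return patterns, selected


# Toy data: two classes, three samples each, three genes per sample.
toy = {
    "A": [[1.0, 4.0, 2.0], [2.0, 5.0, 2.0], [1.0, 6.0, 2.0]],
    "B": [[9.0, 4.0, 2.0], [8.0, 5.0, 2.0], [9.0, 6.0, 8.0]],
}
patterns, selected = select_genes(toy, threshold=0.6)
print(patterns)
print(selected)
```

On this toy input only gene 0 survives: it is consistently Low in class A and High in class B, whereas gene 1 has no dominant label in either class and gene 2 is Low in both. The reduced, discretized profiles are the kind of simplified input that a clustering stage such as a GCS network could then consume.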
