Data Analysis of Arabidopsis Tiling Array

DNA tiling microarray technology has become a major tool for genomic research. Because tiling arrays are high-density and high-throughput, they make it possible to study gene expression and transcript structure at the scale of the whole genome. However, because of the volume and complexity of the data, the analysis of tiling array data is not yet streamlined. Although some dynamic programming approaches have been applied successfully to yeast tiling array data, the segmentation problem is considerably more challenging for the genomes of higher eukaryotes such as Arabidopsis. In this paper, we apply a machine learning method that combines the advantages of hidden Markov models (HMMs) and support vector machines (SVMs) to Arabidopsis tiling array data. After probe filtering and normalization of wild-type samples, the method is used to identify gene structures.
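To make the segmentation step concrete, the following is a minimal sketch of the decoding side of an HMM/SVM hybrid: Viterbi dynamic programming over a linear scoring function, applied to a one-dimensional track of normalized probe intensities. It is an illustration under stated assumptions, not the implementation used in this paper. The two-state label set, the function name `viterbi_segment`, and the hand-set weights are all hypothetical; in a hidden Markov SVM the emission and transition weights would be learned by large-margin training rather than fixed by hand.

```python
import numpy as np

# Hypothetical two-state label set; a real gene-structure model would use more states.
STATES = ("intergenic", "exon")


def viterbi_segment(intensities, emit_w, emit_b, trans):
    """Return the highest-scoring state index path for a probe intensity track.

    The score of a path s_1..s_n is
        sum_t (emit_w[s_t] * x_t + emit_b[s_t]) + sum_t trans[s_{t-1}][s_t],
    i.e. a linear discriminant score rather than a log-probability.
    """
    x = np.asarray(intensities, dtype=float)
    emit_w = np.asarray(emit_w, dtype=float)
    emit_b = np.asarray(emit_b, dtype=float)
    trans = np.asarray(trans, dtype=float)
    n, k = len(x), len(emit_w)
    if n == 0:
        return np.empty(0, dtype=int)

    # emit[t, s] is the emission score of state s at probe t.
    emit = np.outer(x, emit_w) + emit_b
    score = np.empty((n, k))
    back = np.zeros((n, k), dtype=int)
    score[0] = emit[0]

    for t in range(1, n):
        # cand[i, j]: best score ending in state i at t-1, then moving to state j.
        cand = score[t - 1][:, None] + trans
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + emit[t]

    # Trace back from the best final state.
    path = np.empty(n, dtype=int)
    path[-1] = score[-1].argmax()
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


# Hand-set illustrative weights (assumptions, not learned values):
# intergenic scores 1 - x, exon scores x - 1, and each state switch costs 0.5.
EMIT_W = [-1.0, 1.0]
EMIT_B = [1.0, -1.0]
TRANS = [[0.0, -0.5], [-0.5, 0.0]]

track = [0.1, 0.2, 0.1, 2.0, 2.2, 1.9, 0.2, 0.1]
labels = [STATES[s] for s in viterbi_segment(track, EMIT_W, EMIT_B, TRANS)]
print(labels)
```

The transition penalty is what separates segmentation from per-probe thresholding: with these weights, a single probe only slightly above the threshold does not open a new segment, because its emission gain is smaller than the cost of switching state twice.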
