A novel algorithm for generating simulated genetic data based on K-medoids

Genetic data are essential to biological research, but they are hard to obtain experimentally. In this paper, we introduce an algorithm for generating simulated genetic data based on K-medoids. The algorithm proposes the concept of a Cluster Channel, which is used to generate the simulated data. The proposed method can also eliminate noise in the original data. Experimental results demonstrate the reliability of the simulated genetic data. We analyze both the simulated and the original data with SAM (significance analysis of microarrays) and conclude that the simulated data can effectively validate algorithms for detecting differentially expressed genes.
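For readers unfamiliar with the clustering step the method builds on, the following is a minimal sketch of plain K-medoids (alternating assignment and medoid update) on toy expression profiles. It is not the authors' implementation: the function names, the Euclidean distance, the random initialization, and the toy data are all assumptions made for illustration, and the Cluster Channel construction is not specified in this abstract, so it is not modeled here.

```python
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def assign(points, medoids):
    """Map each medoid index to the indices of the points nearest to it."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        if i in clusters:
            # A medoid always belongs to its own cluster.
            clusters[i].append(i)
            continue
        nearest = min(medoids, key=lambda m: euclidean(p, points[m]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(points, k, max_iter=100, seed=0):
    """Plain alternating K-medoids; returns (medoid indices, clusters)."""
    rng = random.Random(seed)
    # Assumption: medoids are initialized by uniform random sampling.
    medoids = sorted(rng.sample(range(len(points)), k))
    for _ in range(max_iter):
        clusters = assign(points, medoids)
        # New medoid of a cluster: the member with the smallest total
        # distance to all other members of that cluster.
        new_medoids = sorted(
            min(
                members,
                key=lambda c: sum(euclidean(points[c], points[j]) for j in members),
            )
            for members in clusters.values()
        )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


# Toy data (hypothetical): six two-condition expression profiles
# forming two well-separated groups.
profiles = [
    (1.0, 1.1), (0.9, 1.0), (1.1, 0.9),
    (8.0, 8.1), (7.9, 8.0), (8.1, 7.9),
]
medoids, clusters = k_medoids(profiles, k=2)
for m, members in clusters.items():
    print("medoid", m, "-> members", members)
```

Unlike K-means, each cluster center here is an actual observed profile, which is one reason medoid-based clustering is less sensitive to outlying measurements.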
