Finding motifs using harmony search

The paper proposes a novel methodology for finding motifs in biological sequence data. It uses harmony search, a music-inspired metaheuristic optimization technique, to find motifs. The model takes randomly generated l-mers as the initial harmony memory. Pitch adjustment and random selection are used to generate new l-mers, which are evaluated by a specially defined objective function. The proposed method is experimentally validated on sequences of Human Papillomavirus strains obtained from accredited and authorized sources.
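
The loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the objective function, so the sketch assumes a common stand-in (the sum, over all sequences, of the minimum Hamming distance between the candidate l-mer and any window of the sequence). The function names, the parameter values (`hms`, `hmcr`, `par`, `iterations`), and the choice to model pitch adjustment as a substitution with a different nucleotide are all assumptions made for illustration.

```python
import random

ALPHABET = "ACGT"


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def total_distance(lmer, sequences):
    """Assumed objective (not the paper's): sum over sequences of the best
    (minimum) Hamming distance between lmer and any window. Lower is better.
    Every sequence must be at least len(lmer) long."""
    l = len(lmer)
    return sum(
        min(hamming(lmer, seq[i:i + l]) for i in range(len(seq) - l + 1))
        for seq in sequences
    )


def harmony_search(sequences, l, hms=10, hmcr=0.9, par=0.3,
                   iterations=2000, seed=0):
    """Return (best l-mer, its score) after a fixed number of improvisations."""
    rng = random.Random(seed)
    # Initial harmony memory: randomly generated l-mers.
    memory = ["".join(rng.choice(ALPHABET) for _ in range(l))
              for _ in range(hms)]
    scores = [total_distance(h, sequences) for h in memory]

    for _ in range(iterations):
        new = []
        for pos in range(l):
            if rng.random() < hmcr:
                # Memory consideration: reuse this position from a stored l-mer.
                base = rng.choice(memory)[pos]
                if rng.random() < par:
                    # Pitch adjustment (assumed form): switch to another base.
                    base = rng.choice([b for b in ALPHABET if b != base])
            else:
                # Random selection from the whole alphabet.
                base = rng.choice(ALPHABET)
            new.append(base)
        candidate = "".join(new)
        score = total_distance(candidate, sequences)

        # Replace the worst stored harmony if the candidate is better.
        worst = max(range(hms), key=scores.__getitem__)
        if score < scores[worst] and candidate not in memory:
            memory[worst], scores[worst] = candidate, score

    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


if __name__ == "__main__":
    # Toy input with a planted 6-mer; real use would load the actual sequences.
    seqs = ["TTGACGTCAAT", "CCGACGTCGGA", "AAAGACGTCTT"]
    print(harmony_search(seqs, l=6))
```

A higher `hmcr` makes the search lean more on l-mers already in memory, while `par` controls how often a reused position is perturbed; the values above are placeholders rather than tuned settings.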
