Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics

Detecting genetic interactions without running an exhaustive search is a difficult problem. We present a new heuristic, multiSURF*, which detects these interactions with high accuracy and in time linear in the number of genes. Our algorithm improves on the SURF* algorithm, which detects genetic signals by comparing individuals that are close to, and far from, one another and noting whether their differences correlate with differing disease status. By examining only individuals very near to and very far from one another, multiSURF* consistently outperforms SURF* while substantially reducing runtime. Additionally, we analyze real data and show that our method provides new information. We conclude that multiSURF* is a better alternative to SURF* in both power and runtime.
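
The near/far comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation of multiSURF* or SURF*: the function name `near_far_scores`, the Hamming distance, and the dead band of half a standard deviation around each individual's mean distance are all assumptions made for illustration. The sketch only shows the general idea that feature differences are scored one way for near neighbors and the opposite way for far neighbors, at a cost per pair that is linear in the number of features.

```python
from statistics import mean, pstdev


def near_far_scores(genotypes, status):
    """Score features with a simplified near/far neighbor heuristic.

    genotypes: list of equal-length lists of discrete genotype codes.
    status:    list of disease-status labels, one per individual.
    Illustrative only; the thresholds below are assumptions.
    """
    n = len(genotypes)
    p = len(genotypes[0])
    scores = [0.0] * p
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Hamming distance from individual i to every other individual.
        dist = {
            j: sum(a != b for a, b in zip(genotypes[i], genotypes[j]))
            for j in others
        }
        center = mean(dist.values())
        half_band = pstdev(dist.values()) / 2  # assumed dead-band width
        for j in others:
            if dist[j] < center - half_band:
                sign = 1.0    # near neighbor
            elif dist[j] > center + half_band:
                sign = -1.0   # far neighbor: contributions are inverted
            else:
                continue      # skip individuals in the middle band
            same_status = status[i] == status[j]
            for f in range(p):  # linear in the number of features
                if genotypes[i][f] != genotypes[j][f]:
                    # Near: a difference is rewarded when status differs and
                    # penalized when status matches. Far: the reverse.
                    scores[f] += sign * (-1.0 if same_status else 1.0)
    return scores


# Toy data: status is the XOR of the first two features; the third is constant.
toy_genotypes = [[0, 0, 0], [0, 1, 0], [1, 0, 0], [1, 1, 0]]
toy_status = [0, 1, 1, 0]
print(near_far_scores(toy_genotypes, toy_status))  # [8.0, 8.0, 0.0]
```

On this toy data the two interacting features receive equal positive scores, while the constant feature, which never differs between individuals, scores zero.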
