Filter-Based Methodology for the Location of Hot Spots in Proteins and Exons in DNA

The receiver operating characteristic (ROC) technique is used within an optimization procedure to improve and assess a filter-based methodology for locating hot spots in protein sequences and exons in DNA sequences. Optimizing the characteristic values assigned to the nucleotides yields high efficiency as well as improved accuracy relative to results obtained with the electron-ion interaction potentials. Alternatively, applying the proposed filter-based methodology to binary sequences improves accuracy, although efficiency is somewhat compromised relative to that achieved with the optimized characteristic values. Extensive experimental results, evaluated using measures such as the g-mean, the Matthews correlation coefficient, and the chi-square statistic, show that the filter-based methodology performs much better than existing techniques based on the short-time discrete Fourier transform, particularly in applications where short exons are involved.
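As a rough illustration of the evaluation measures named above, the following sketch computes the g-mean, the Matthews correlation coefficient, and the chi-square statistic from per-position binary labels (1 for a position inside an exon or hot spot, 0 otherwise). It is not taken from the paper. The function names and the example labels are hypothetical, the g-mean is assumed to be the geometric mean of sensitivity and specificity, and the chi-square statistic is assumed to be the uncorrected Pearson statistic of the 2x2 confusion table.

```python
import math


def confusion_counts(truth, predicted):
    """Count (TP, FP, TN, FN) for two equal-length sequences of 0/1 labels."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fn = sum(1 for t, p in pairs if t and not p)
    return tp, fp, tn, fn


def _ratio(num, den):
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    return num / den if den else 0.0


def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return math.sqrt(sensitivity * specificity)


def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient of the 2x2 confusion table."""
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return _ratio(tp * tn - fp * fn, den)


def chi_square(tp, fp, tn, fn):
    """Uncorrected Pearson chi-square statistic of the 2x2 confusion table."""
    n = tp + fp + tn + fn
    den = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    return _ratio(n * (tp * tn - fp * fn) ** 2, den)


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    truth = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
    predicted = [0, 1, 1, 1, 0, 0, 0, 1, 0, 0]
    counts = confusion_counts(truth, predicted)
    print("TP, FP, TN, FN:", counts)
    print("g-mean:", g_mean(*counts))
    print("MCC:", matthews_cc(*counts))
    print("chi-square:", chi_square(*counts))
```

For a 2x2 table, the uncorrected chi-square statistic equals the number of positions times the square of the Matthews correlation coefficient, so under these assumptions the two measures rank predictors of equal-length sequences in the same order whenever the correlation is non-negative.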
