Effects of windowing and zero-padding on Complex Resonant Recognition Model for protein sequence analysis

Signal processing techniques such as the Fourier transform have been widely studied and successfully applied in many areas, and auxiliary techniques such as zero-padding and windowing have proved useful for improving their results. The Resonant Recognition Model (RRM) and the Complex Resonant Recognition Model (CRRM), which are based on the discrete Fourier transform and widely used for protein sequence analysis, do not take these techniques into account, even though they can improve or alter the features extracted from protein sequences. This paper therefore presents an extensive analysis of the influence of zero-padding and windowing on the features extracted with the CRRM. As a case study, five classes of influenza A virus neuraminidase genes (H1N1, H1N2, H2N2, H3N2 and H5N1) were used. For each subtype and for each signal length considered, two sets of Common Frequency Peaks (CFP) were extracted: one with windowing applied and one without. Zero-padding was used to bring all signals (protein sequences) to the same length. The signal lengths used were 470, which is the maximum protein length, and, for further analysis, 512, 1024, 2048, 4096, 8192 and 16384. The results suggest that windowing and zero-padding have a substantial impact on the CFP extracted from the influenza A subtypes: the best match with the CFP extracted using the CRRM was obtained when a signal length of 4096 and windowing were applied together. These findings should therefore be taken into account for more accurate and reliable analysis of protein sequences.
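The preprocessing described above (optional windowing, then zero-padding to a fixed signal length before the discrete Fourier transform) can be illustrated with a minimal sketch. It is not the CRRM implementation used in the study. The function name `padded_spectrum`, the choice of a Hamming window and the mean removal are assumptions made for illustration only, and the input is taken to be a protein sequence that has already been mapped to numerical values.

```python
import numpy as np

def padded_spectrum(values, n_fft, use_window=True):
    """Magnitude spectrum of a numeric sequence, zero-padded to n_fft samples.

    Hypothetical helper for illustration; not the study's CRRM code.
    """
    x = np.asarray(values, dtype=float)
    if n_fft < len(x):
        raise ValueError("n_fft must be at least the sequence length")
    # Assumption: remove the mean so the zero-frequency bin does not dominate.
    x = x - x.mean()
    if use_window:
        # Assumption: Hamming window; the abstract does not name the window.
        x = x * np.hamming(len(x))
    # rfft appends zeros up to n_fft samples before transforming.
    return np.abs(np.fft.rfft(x, n=n_fft))

# Toy signal standing in for a numerically encoded protein of length 470.
n = np.arange(470)
toy = np.sin(2 * np.pi * 0.1 * n)

for length in (470, 512, 1024, 2048, 4096, 8192, 16384):
    with_window = padded_spectrum(toy, length, use_window=True)
    without_window = padded_spectrum(toy, length, use_window=False)
    # Normalized peak frequency (cycles per sample) for each variant.
    print(length,
          np.argmax(with_window) / length,
          np.argmax(without_window) / length)
```

Zero-padding does not add information to the signal; it samples the same underlying spectrum on a finer frequency grid, while the window trades main-lobe width for lower spectral leakage. Both choices can therefore shift which bins appear as peaks, which is the kind of effect the study examines.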
