A New Distance-based Approach for Phylogenetic Analysis of Protein Sequences

With ever-increasing gene and protein sequence data available across a large number of species, reconstructing phylogenetic trees to reveal the evolutionary relationships among those species is becoming increasingly important. In this paper, we take the physicochemical properties of amino acids into account and introduce protein feature sequences into phylogenetic analysis by using the Bhattacharyya distance. The phylogenetic trees built on two data sets show that the proposed approach performs as well as other methods and is more efficient than some of them. Our method may therefore be used to complement phylogenetic analysis.
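
As a rough illustration of the idea, the following Python sketch maps each protein to a distribution over amino-acid classes and compares two proteins with the Bhattacharyya distance, D_B(p, q) = -ln(sum_i sqrt(p_i q_i)). The abstract does not specify how the feature sequences are built, so the three-class hydropathy grouping and the names `AA_CLASSES`, `feature_distribution`, and `distance_matrix` are assumptions made for this example, not the paper's procedure.

```python
import math
from collections import Counter

# Hypothetical grouping of the 20 amino acids into three classes.
# This is an assumption for illustration, not the paper's feature definition.
AA_CLASSES = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGPH",
    "charged": "DEKR",
}
CLASS_ORDER = list(AA_CLASSES)
AA_TO_CLASS = {aa: name for name, members in AA_CLASSES.items() for aa in members}


def feature_distribution(seq):
    """Return the relative frequency of each class in a protein sequence."""
    counts = Counter(AA_TO_CLASS[aa] for aa in seq.upper() if aa in AA_TO_CLASS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no recognised amino acids")
    return [counts[name] / total for name in CLASS_ORDER]


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions."""
    # Bhattacharyya coefficient: overlap between the two distributions.
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    if bc <= 0.0:
        return math.inf  # no shared support at all
    # Clamp at zero to absorb floating-point rounding when bc is about 1.
    return max(0.0, -math.log(bc))


def distance_matrix(seqs):
    """Pairwise Bhattacharyya distances for a list of protein sequences."""
    dists = [feature_distribution(s) for s in seqs]
    return [[bhattacharyya_distance(a, b) for b in dists] for a in dists]
```

A matrix produced by `distance_matrix` could then be passed to any standard distance-based tree-building method, such as neighbor joining or UPGMA; which one the paper uses is not stated in the abstract.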
