RaacLogo: a new sequence logo generator by using reduced amino acid clusters

Sequence logos provide a fast and concise way to visualize consensus sequences. Proteins exhibit greater complexity and diversity than DNA, which often complicates the graphical representation of a logo. Reduced amino acid alphabets are a powerful means of simplifying the complexity of sequence alignments, which motivated us to establish RaacLogo. As a new sequence logo generator based on reduced amino acid alphabets, RaacLogo can easily generate many different simplified logos tailored to users' needs by selecting among various reduced amino acid alphabets derived from more than 40 clustering algorithms. The current web server provides 74 types of reduced amino acid alphabet, which were manually extracted to generate 673 reduced amino acid clusters (RAACs) for handling protein alignments. A two-dimensional selector is provided for easily selecting the desired RAACs based on underlying biological knowledge. We anticipate that the RaacLogo web server will be useful for protein sequence alignment, topological estimation and protein design experiments. RaacLogo is freely available at http://bioinfor.imu.edu.cn/raaclogo.
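
To illustrate the general idea of reducing an alignment before drawing a logo, the following is a minimal sketch and not RaacLogo's implementation. The six-cluster grouping `EXAMPLE_CLUSTERS` and the helper names are assumptions made up for this example; the grouping is not claimed to be one of the 673 RAACs on the server. Column heights use the standard information content log2(alphabet size) minus the Shannon entropy, without small-sample correction.

```python
import math
from collections import Counter

# Hypothetical grouping of the 20 amino acids into 6 clusters,
# chosen only for illustration (not taken from RaacLogo).
EXAMPLE_CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]


def build_mapping(clusters):
    """Map every residue to the first letter of its cluster."""
    mapping = {}
    for cluster in clusters:
        for residue in cluster:
            mapping[residue] = cluster[0]
    return mapping


def reduce_sequence(seq, mapping):
    """Rewrite a sequence in the reduced alphabet; gaps and unknowns pass through."""
    return "".join(mapping.get(residue, residue) for residue in seq)


def column_information(aligned, alphabet_size):
    """Information content (bits) per alignment column, ignoring gap characters."""
    info = []
    for i in range(len(aligned[0])):
        column = [seq[i] for seq in aligned if seq[i] != "-"]
        total = len(column)
        if total == 0:
            info.append(0.0)  # all-gap column carries no information
            continue
        counts = Counter(column)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(math.log2(alphabet_size) - entropy)
    return info


mapping = build_mapping(EXAMPLE_CLUSTERS)
alignment = ["AKD", "VKE", "LRD"]
reduced = [reduce_sequence(seq, mapping) for seq in alignment]

full_info = column_information(alignment, alphabet_size=20)
reduced_info = column_information(reduced, alphabet_size=len(EXAMPLE_CLUSTERS))
```

In this toy alignment every column collapses to a single reduced symbol, so each reduced column has zero entropy, whereas the original 20-letter columns are mixed. The per-column values could then be passed to any logo-drawing library as stack heights.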
