MMM-QSAR Recognition of Ribonucleases without Alignment: Comparison with an HMM Model and Isolation from Schizosaccharomyces pombe, Prediction, and Experimental Assay of a New Sequence

The study of type III RNases constitutes an important area in molecular biology. The pac1+ gene is known to encode a particular RNase III that shares low amino acid similarity with the products of other genes despite having double-stranded ribonuclease activity. Bioinformatics methods based on sequence alignment may fail when the amino acid identity between a query sequence and others with similar functions is low (remote homologues), or when no similar sequence is recorded in the database. Quantitative structure-activity relationships (QSAR) applied to protein sequences may allow an alignment-independent prediction of protein function. These sequence QSAR-like methods often use 1D numerical parameters of the sequence as input to seek sequence-function relationships. However, first representing the sequence in 2D may uncover useful higher-order information. In the work described here we calculated, for the first time, the spectral moments of a Markov matrix (MMM) associated with a 2D-HP-map of a protein sequence. We used MMM values to characterize numerically 81 sequences of type III RNases and 133 proteins of a control group. We subsequently developed one MMM-QSAR and one classic hidden Markov model (HMM) based on the same data. The MMM-QSAR discriminated RNases from other proteins with 97.35% accuracy without using alignment, a result as good as that of the known HMM techniques. We also report for the first time the isolation of a new Pac1 protein (DQ647826) from Schizosaccharomyces pombe strain 428-4-1. The MMM-QSAR model predicts the new RNase III with the same accuracy as classical alignment methods. Experimental assay of this protein confirms the predicted activity. The present results suggest that MMM-QSAR models may be used for protein function annotation without sequence alignment and with the same accuracy as classic HMM models.
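As a rough illustration of the descriptor named above, the k-th spectral moment of a Markov matrix can be taken as the trace of its k-th power. The abstract does not say how the 2D-HP-map is built or weighted, so the sketch below starts from a hypothetical 3-node weight matrix rather than a real protein map; the function names `markov_matrix`, `mat_mul`, and `spectral_moments` are likewise assumptions made only for this example.

```python
def markov_matrix(weights):
    """Row-normalize a non-negative square weight matrix so each row sums to 1."""
    matrix = []
    for row in weights:
        total = sum(row)
        if total == 0:
            raise ValueError("every node needs at least one non-zero weight")
        matrix.append([w / total for w in row])
    return matrix


def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(pi, max_order):
    """Return [Tr(pi^0), Tr(pi^1), ..., Tr(pi^max_order)]."""
    n = len(pi)
    # Start from the identity matrix, i.e. pi raised to the power 0.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    moments = []
    for _ in range(max_order + 1):
        moments.append(sum(power[i][i] for i in range(n)))
        power = mat_mul(power, pi)
    return moments


# Hypothetical toy map: three nodes in a path, each also linked to itself.
# This is NOT a real 2D-HP-map; it only exercises the arithmetic.
toy_weights = [[1, 1, 0],
               [1, 1, 1],
               [0, 1, 1]]

pi = markov_matrix(toy_weights)
moments = spectral_moments(pi, 3)
```

In a QSAR setting of this kind, a short vector of such moments per sequence would serve as the numerical input to a classifier; which orders are used and how the classifier is fitted are not stated in the abstract.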
