Classification of Riboswitch Families Using Block Location-Based Feature Extraction (BLBFE) Method

Purpose: Riboswitches are non-coding sequences, usually located in the untranslated regions of mRNAs, that regulate gene expression and consequently cellular function. Their interaction with antibiotics has also recently been implicated, which has increased interest in developing bioinformatics tools for riboswitch studies. Herein, we describe the development and use of a novel block location-based feature extraction (BLBFE) method for the classification of riboswitches. Methods: We previously developed and reported a sequential block finding (SBF) algorithm which, without relying on alignment methods, identifies family-specific sequential blocks for riboswitch families. Herein, we applied this algorithm to 7 riboswitch families: lysine, cobalamin, glycine, SAM-alpha, SAM-IV, cyclic-di-GMP-I and SAH. The study was then extended to implement the BLBFE method for feature extraction. The resulting features were used in four classifiers, namely linear discriminant analysis (LDA), probabilistic neural network (PNN), decision tree and k-nearest neighbors (KNN), to classify the riboswitch families. Classifier performance was evaluated using correct classification rate (CCR), accuracy, sensitivity, specificity and f-score. Results: The average CCR for classification of riboswitches was 87.87%. Across the 4 classifiers, the BLBFE method gave average accuracies of 93.98% to 96.1%, average sensitivities of 76.76% to 83.61%, average specificities of 96.53% to 97.69% and average f-scores of 74.9% to 81.91%. Conclusion: Our results confirm that the proposed feature extraction method, BLBFE, can be successfully used for classification and discrimination of the riboswitch families with high CCR, accuracy, sensitivity, specificity and f-score values.
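
The abstract does not spell out how the performance measures are computed. As a minimal sketch, assuming the standard one-vs-rest definitions on a multi-class confusion matrix (an assumption, not the authors' stated procedure), the measures can be derived as below. The function name `per_class_metrics` and the 3-class toy matrix are hypothetical and are not data from the study.

```python
def per_class_metrics(cm):
    """Compute CCR and one-vs-rest metrics from a confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Definitions are the standard textbook ones
    (assumed here; the abstract does not define them).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    # Correct classification rate: fraction of samples on the diagonal.
    ccr = sum(cm[k][k] for k in range(n)) / total

    rows = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                         # class k predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp    # other classes predicted as k
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f_score = 2 * precision * sensitivity / denom if denom else 0.0
        rows.append({
            "accuracy": (tp + tn) / total,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f_score": f_score,
        })
    return ccr, rows


# Hypothetical 3-class confusion matrix, for illustration only.
cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
ccr, rows = per_class_metrics(cm)
# Average each measure over the classes (macro average).
macro = {name: sum(r[name] for r in rows) / len(rows) for name in rows[0]}
print(f"CCR = {ccr:.4f}")
for name, value in macro.items():
    print(f"average {name} = {value:.4f}")
```

Under these assumed definitions, per-class accuracy counts true negatives, so the class-averaged accuracy can be higher than the CCR. That would be consistent with the reported average accuracies (93.98% to 96.1%) exceeding the average CCR (87.87%).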
