Paper information - PEA: an integrated R toolkit for plant epitranscriptome analysis

PEA: an integrated R toolkit for plant epitranscriptome analysis

Motivation: The epitranscriptome, also known as chemical modifications of RNA (CMRs), is a newly discovered layer of gene regulation, the biological importance of which emerged through analysis of only a small fraction of CMRs detected by high-throughput sequencing technologies. Understanding of the epitranscriptome is hampered by the absence of computational tools for the systematic analysis of epitranscriptome sequencing data. In addition, no tools have yet been designed for accurate prediction of CMRs in plants, or to extend epitranscriptome analysis from a fraction of the transcriptome to its entirety.

Results: Here, we introduce PEA, an integrated R toolkit to facilitate the analysis of plant epitranscriptome data. The PEA toolkit contains a comprehensive collection of functions required for read mapping, CMR calling, motif scanning and discovery, and gene functional enrichment analysis. PEA also takes advantage of machine learning technologies for transcriptome-scale CMR prediction, with high prediction accuracy, using the Positive Samples Only Learning algorithm, which addresses the two-class classification problem by using only positive samples (CMRs), in the absence of negative samples (non-CMRs). Hence PEA is a versatile epitranscriptome analysis pipeline covering CMR calling, prediction, and annotation, and we describe its application to predict N6-methyladenosine (m6A) modifications in Arabidopsis thaliana. Experimental results demonstrate that the toolkit achieved 71.6% sensitivity and 73.7% specificity, which is superior to existing m6A predictors. PEA is potentially broadly applicable to the in-depth study of epitranscriptomics.

Availability: PEA is implemented using R and available at https://github.com/cma2015/PEA.
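The idea behind learning from positive samples only, as mentioned in the abstract, can be illustrated with a small toy sketch. This is not PEA's code and does not use PEA's API: the function name `psol_sketch`, its parameters, and the 2-D toy points are all hypothetical, and a simple nearest-centroid rule is swapped in where a real pipeline would presumably train a stronger classifier on sequence-derived features. The sketch only shows the general two-step pattern: pick an initial negative set from the unlabeled samples that lie farthest from the known positives, then expand that set iteratively with unlabeled samples that are confidently classified as negative.

```python
from math import dist
from statistics import mean


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(mean(axis) for axis in zip(*points))


def psol_sketch(positives, unlabeled, n_initial=2, margin=0.0, max_rounds=10):
    """Toy positive-only learning loop (hypothetical, for illustration).

    Returns (negatives, remaining): the inferred negative set and the
    unlabeled points that were never confidently assigned to it.
    """
    pos_c = centroid(positives)

    # Step 1: initial negatives are the unlabeled points farthest
    # from the centroid of the known positives.
    ranked = sorted(unlabeled, key=lambda p: dist(p, pos_c), reverse=True)
    negatives, remaining = ranked[:n_initial], ranked[n_initial:]

    # Step 2: repeatedly re-estimate the negative centroid and move over
    # every unlabeled point that is closer to it than to the positives
    # by more than `margin`.
    for _ in range(max_rounds):
        neg_c = centroid(negatives)
        confident = [p for p in remaining
                     if dist(p, pos_c) - dist(p, neg_c) > margin]
        if not confident:
            break
        negatives = negatives + confident
        remaining = [p for p in remaining if p not in confident]

    return negatives, remaining


# Toy data: three known positives near the origin, five unlabeled points.
positives = [(0, 0), (0, 1), (1, 0)]
unlabeled = [(0.5, 0.5), (9, 9), (10, 10), (8, 9), (1, 1)]
negatives, remaining = psol_sketch(positives, unlabeled)
print("inferred negatives:", negatives)
print("still unlabeled:  ", remaining)
```

In this toy run the three distant points end up in the negative set, while the two points near the positives stay unlabeled. A classifier trained on the positives plus the inferred negatives could then score those remaining candidates, which is the step where measures such as sensitivity and specificity become meaningful.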

Chuang Ma | Qian Cheng | Jingjing Zhai | Jie Song | Yunjia Tang

[1] Xiangfeng Wang, et al. Machine Learning–Based Differential Network Analysis: A Study of Stress-Responsive Transcriptomes in Arabidopsis, 2014, Plant Cell.

[2] K. Chou, et al. iRNA-Methyl: Identifying N(6)-methyladenosine sites using pseudo nucleotide composition, 2015, Analytical Biochemistry.

[3] Wei Chen, et al. Identifying N6-methyladenosine sites in the Arabidopsis thaliana transcriptome, 2016, Molecular Genetics and Genomics.

[4] Chuan He, et al. Post-transcriptional gene regulation by mRNA modifications, 2016, Nature Reviews Molecular Cell Biology.

[5] Zhike Lu, et al. Unique Features of the m6A Methylome in Arabidopsis thaliana, 2014, Nature Communications.

[6] Yuri Motorin, et al. Detecting RNA modifications in the epitranscriptome: predict and validate, 2017, Nature Reviews Genetics.