Scrutinizing functional interaction networks from RNA-binding proteins to their targets in cancer

RNA-binding proteins (RBPs) participate in all stages of the RNA life cycle, from transcription and splicing to translation. Under the ENCODE project, a large number of RBPs were knocked down in human cancer cell lines, offering an excellent opportunity to infer the targets of RBPs. Taking both RBP binding sites and RNA-seq profiles of RBP knockdown samples as input, we present a pipeline to identify causal RBP-RNA interactions. The pipeline employs a recent functional chi-square test (FunChisq) that deciphers directional association, and uses a novel functional index that measures the effect size of functional dependency. We examined $\sim 45$ million RBP-RNA pairs in leukemia (K562) and liver cancer (HepG2) cell lines for functional patterns as causal interaction candidates. Here, we report a total of 936,707 RBP-RNA pairs in the two cell lines that show statistically significant linear or nonlinear functional patterns. About 31% of these pairs have supporting biological evidence from other sources, suggesting that the pipeline is effective. The interactions constitute RBP-specific regulatory networks that may represent core mechanisms in the two cancers. The pipeline is implemented through an R interface with pre-computed results and data libraries, so that users can query specific networks and visualize RBP-RNA interactions. Such networks serve as a useful resource for studying RNA dysregulation in cancer.
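
To make the idea of a directional functional test concrete, the following is a minimal Python sketch of a functional chi-square statistic computed on a contingency table of discretized RBP (rows) versus target RNA (columns) levels. It assumes the published definition of the statistic (row-wise deviation from a uniform child distribution, minus the deviation of the child marginal from uniform). The function name `fun_chisq` and the normalization used for the effect-size index are illustrative assumptions and may differ from the index used in the pipeline. P-value computation and multiple-testing correction are omitted.

```python
from math import sqrt

def fun_chisq(table):
    """Functional chi-square statistic for the hypothesis Y = f(X).

    table[i][j] is the number of samples with X at level i (rows, parent)
    and Y at level j (columns, child). Returns (statistic, degrees of
    freedom, effect-size index). The index normalization is an assumption.
    """
    r = len(table)       # number of X levels
    s = len(table[0])    # number of Y levels
    row_sums = [sum(row) for row in table]
    col_sums = [sum(table[i][j] for i in range(r)) for j in range(s)]
    n = sum(row_sums)

    stat = 0.0
    # Within each X level, measure how far Y is from a uniform distribution.
    for i in range(r):
        if row_sums[i] == 0:
            continue  # an empty row contributes nothing
        expected = row_sums[i] / s
        stat += sum((table[i][j] - expected) ** 2 for j in range(s)) / expected

    # Subtract the deviation of the Y marginal from uniform, so that a
    # skewed Y alone does not count as evidence of functional dependency.
    expected_col = n / s
    stat -= sum((c - expected_col) ** 2 for c in col_sums) / expected_col

    df = (r - 1) * (s - 1)
    # Assumed effect-size index: statistic scaled by its upper bound n(s-1).
    index = sqrt(max(stat, 0.0) / (n * (s - 1)))
    return stat, df, index

# Y is a many-to-one function of X, so X -> Y should score higher than Y -> X.
x_to_y = [[5, 0], [0, 5], [5, 0]]
y_to_x = [list(col) for col in zip(*x_to_y)]  # transpose swaps the direction
print(fun_chisq(x_to_y)[2] > fun_chisq(y_to_x)[2])
```

Because the statistic is not symmetric under transposition of the table, comparing the two directions is what lets a test of this kind suggest which variable is the more plausible parent.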
