motifNet: A Neural Network Approach for Learning Functional Sequence Patterns in mRNA

We present motifNet, a new approach for predicting functional sequence patterns in mRNA, known as motifs. Motifs are important for understanding the mechanisms of the cell life cycle, with applications in clinical research and drug discovery. However, many existing neural network models for mRNA event prediction take only the sequence as input and ignore the positional information of the sequence. In contrast, motifNet is a lightweight neural network that takes both the sequence and its positional information as input. This lets the model learn an implicit neural representation of the motif interaction patterns in human mRNA sequences. The trained model can then be queried interactively to generate motif patterns and positional effect scores for mRNA activities. Additionally, motifNet can identify violations of motif patterns in real human mRNA variants that are associated with disease-related cell dysfunction.
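To make the idea of feeding a network both the sequence and its positional information concrete, the following is a minimal sketch of one possible input encoding. It is not the encoding used by motifNet, which the abstract does not specify. The function name `encode_with_position`, the `ACGU` alphabet order, and the choice of a single coordinate normalized to [-1, 1] are all assumptions made for illustration.

```python
import numpy as np

# Assumed alphabet and channel order; the abstract does not specify one.
NUCLEOTIDES = "ACGU"


def encode_with_position(seq: str) -> np.ndarray:
    """Hypothetical helper: return a (len(seq), 5) float32 array.

    Columns 0-3 are a one-hot encoding of the nucleotide,
    column 4 is the position of that nucleotide scaled to [-1, 1].
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    n = len(seq)

    one_hot = np.zeros((n, len(NUCLEOTIDES)), dtype=np.float32)
    for i, base in enumerate(seq):
        j = NUCLEOTIDES.find(base)
        if j >= 0:  # unknown symbols (e.g. N) are left as all-zero rows
            one_hot[i, j] = 1.0

    # One normalized coordinate per nucleotide (an assumed design choice).
    position = np.linspace(-1.0, 1.0, num=n, dtype=np.float32).reshape(n, 1)

    return np.concatenate([one_hot, position], axis=1)


features = encode_with_position("AUGC")
print(features.shape)
```

A sequence-only model would see just the first four columns. Under this sketch, the extra coordinate column is what would let a downstream network assign different effects to the same motif at different locations in the transcript.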
