MPRAnalyze: statistical framework for massively parallel reporter assays

Massively parallel reporter assays (MPRAs) can measure the regulatory function of thousands of DNA sequences in a single experiment. Despite their growing popularity, MPRA studies are limited by the lack of a unified framework for analyzing the resulting data. Here we present MPRAnalyze, a statistical framework for analyzing MPRA count data. Our model leverages the unique structure of MPRA data to quantify the function of regulatory sequences, compare sequences’ activity across different conditions, and provide the flexibility needed in an evolving field. We demonstrate the accuracy and applicability of MPRAnalyze on simulated and published data and compare it with existing methods.
