An integrative approach to analyze microarray datasets for prioritization of genes relevant to lens biology and disease

Microarray-based profiling is an effective method for analyzing cellular or tissue-specific gene expression at the genome level. However, in comparative analyses between control and mutant samples, microarrays often identify a large number of differentially expressed genes, which makes it challenging to isolate the select “high-priority candidates” that are most relevant to an observed mutant phenotype. Here, we describe an integrative approach for mouse mutant lens microarray gene expression analysis using publicly accessible systems-level information such as wild-type mouse lens expression data in iSyTE (integrated Systems Tool for Eye gene discovery), protein–protein interaction data in public databases, gene ontology enrichment data, and transcription factor binding profile data. This strategy, when applied to small Maf Mafg −/−:Mafk +/− mouse lens microarray datasets (deposited in the NCBI Gene Expression Omnibus database under accession number GSE65500) in Agrawal et al. 2015 [20], led to the effective prioritization of candidate genes linked to lens defects in these mutants. Indeed, from the original list of genes that are differentially expressed at ± 1.5-fold and p < 0.05 in Mafg −/−:Mafk +/− mutant lenses, this analysis led to the identification of thirty-six high-priority candidates, in turn reducing the number of genes for further study by approximately 1/3 of the total. Moreover, eight of these genes are linked to mammalian cataract in the published literature, validating the efficacy of this approach. Additionally, these high-priority candidates contribute valuable information for the assembly of a gene regulatory network in the lens. In sum, the pipeline outlined in this report represents an effective approach for initial as well as downstream microarray expression data analysis to identify genes important for lens biology and cataracts.
We anticipate that this integrative strategy can be extended to prioritize phenotypically relevant candidate genes from microarray data in other cells and tissues.
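
The filtering and prioritization logic summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a signed fold change (negative values mean down-regulation in the mutant), applies the ± 1.5-fold and p < 0.05 cutoffs stated in the abstract, and then ranks genes by how many independent evidence sources support them. The evidence labels, the `min_evidence` threshold, the `GeneResult` type, and the placeholder gene names and values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GeneResult:
    symbol: str
    fold_change: float  # signed fold change, mutant vs. control (assumed convention)
    p_value: float
    # Hypothetical evidence labels, e.g. "isyte_enriched", "ppi", "go_term", "tf_binding"
    evidence: set = field(default_factory=set)


def is_differentially_expressed(gene, fc_cutoff=1.5, p_cutoff=0.05):
    """Apply the +/- 1.5-fold and p < 0.05 cutoffs described in the abstract."""
    return abs(gene.fold_change) >= fc_cutoff and gene.p_value < p_cutoff


def prioritize(genes, min_evidence=2):
    """Keep differentially expressed genes supported by at least
    `min_evidence` sources (an assumed threshold), ranked by the number
    of supporting sources and then by p-value."""
    degs = [g for g in genes if is_differentially_expressed(g)]
    ranked = sorted(degs, key=lambda g: (-len(g.evidence), g.p_value))
    return [g for g in ranked if len(g.evidence) >= min_evidence]


# Placeholder data; these are not values from GSE65500.
example_genes = [
    GeneResult("GeneA", -2.1, 0.01, {"isyte_enriched", "tf_binding"}),
    GeneResult("GeneB", 1.2, 0.001, {"isyte_enriched", "ppi", "go_term"}),
    GeneResult("GeneC", 1.8, 0.20, {"ppi"}),
    GeneResult("GeneD", 1.6, 0.03, {"isyte_enriched", "ppi", "go_term"}),
    GeneResult("GeneE", -1.5, 0.04, {"go_term"}),
]

high_priority = [g.symbol for g in prioritize(example_genes)]
print(high_priority)
```

In this toy input, GeneB fails the fold-change cutoff and GeneC fails the p-value cutoff, while GeneE passes both cutoffs but is supported by only one evidence source. Counting evidence sources equally is a simplification; a real analysis might weight sources differently.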

[1]  R. Maas, et al. Pax6- and Six3-Mediated Induction of Lens Cell Fate in Mouse and Human ES Cells, 2014, PLoS ONE.

[2]  Y. Benjamini, et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing, 1995.

[3]  Pan Du, et al. lumi: a pipeline for processing Illumina microarray, 2008, Bioinform.

[4]  Shawn W. Polson, et al. Development of novel filtering criteria to analyze RNA-sequencing data obtained from the murine ocular lens during embryogenesis, 2014, Genomics data.

[5]  S. Lachke, et al. Molecular characterization of mouse lens epithelial cell lines and their suitability to study RNA granules and cataract associated genes, 2015, Experimental eye research.

[6]  Peter J Park, et al. iSyTE: integrated Systems Tool for Eye gene discovery, 2012, Investigative ophthalmology & visual science.

[7]  Jun Miyoshi, et al. The cell adhesion gene PVRL3 is associated with congenital ocular defects, 2011, Human Genetics.

[8]  James Douglas Engel, et al. Integration and diversity of the regulatory network composed of Maf and CNC families of transcription factors, 2002, Gene.

[9]  A. Sandelin, et al. Applied bioinformatics for the identification of regulatory elements, 2004, Nature Reviews Genetics.

[10]  Jie Zhang, et al. Roles of the 15-kDa Selenoprotein (Sep15) in Redox Homeostasis and Cataract Development Revealed by the Analysis of Sep 15 Knockout Mice, 2011, The Journal of Biological Chemistry.

[11]  Damian Szklarczyk, et al. STRING v9.1: protein-protein interaction networks, with increased coverage and integration, 2012, Nucleic Acids Res.

[12]  K. Nakayama, et al. Nrf2–MafG heterodimers contribute globally to antioxidant and metabolic networks, 2012, Nucleic acids research.

[13]  Brad T. Sherman, et al. Extracting Biological Meaning from Large Gene Lists with DAVID, 2009, Current protocols in bioinformatics.

[14]  David J. Arenillas, et al. Global mapping of binding sites for Nrf2 identifies novel targets in cell survival response through ChIP-Seq profiling and network analysis, 2010, Nucleic acids research.

[15]  Michelle R. Campbell, et al. Identification of novel NRF2-regulated genes by ChIP-Seq: influence on retinoid X receptor alpha, 2012, Nucleic acids research.

[16]  R. Maas, et al. Building the developmental oculome: systems biology in vertebrate eye development and disease, 2010, Wiley interdisciplinary reviews. Systems biology and medicine.

[17]  Melinda K. Duncan, et al. Loss of Sip1 leads to migration defects and retention of ectodermal markers during lens development, 2014, Mechanisms of Development.

[18]  Fowzan S Alkuraya, et al. Mutations in the RNA Granule Component TDRD7 Cause Cataract and Glaucoma, 2011, Science.

[19]  Jian Sun, et al. Histone posttranslational modifications and cell fate determination: lens induction requires the lysine acetyltransferases CBP and p300, 2013, Nucleic acids research.

[20]  Masayuki Yamamoto, et al. Compound mouse mutants of bZIP transcription factors Mafg and Mafk reveal a regulatory network of non-crystallin genes associated with cataract, 2015, Human Genetics.

[21]  P. Shannon, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks, 2003, Genome research.