Analysis of Gene Expression Time Series Data of Ebola Vaccine Response using the NeuCube and Temporal Feature Selection

This paper investigates a pipeline for processing temporal gene expression data that combines spiking neural networks (SNNs) with temporal feature selection to support genomic marker discovery. A promising temporal feature selection method was evaluated on a dataset from Ebola vaccine trials, using the NeuCube as the classifier, and compared against a set of previously identified genes. Classification results obtained with the temporal selection method and the NeuCube model were significantly better than those obtained with previously published gene sets. The discovered gene markers and their corresponding gene interaction network (GIN) are new and have not been published before. These results demonstrate both the potential of the examined feature selection method and how SNNs can be used for time series modelling and the discovery of novel GINs. Future work includes improving temporal feature selection methods for gene expression data and refining the use of SNNs for time series analysis.
