Visualization of non-metric relationships by adaptive learning multiple maps t-SNE regularization

Some disease-related symptoms share a common pathological and physiological mechanism, a phenomenon known as phenotypic overlap. Researchers have attempted to visualize the phenotypic relationships between human diseases from a machine learning perspective, but traditional visualization methods are subject to the fundamental limitations of metric spaces. The multiple maps t-SNE (mm-tSNE) regularization method, a probabilistic method for visualizing data points in multiple low-dimensional spaces, has been proposed to address this limitation. However, its convergence is slow when it is applied to large-scale datasets. We use RMSProp with Nesterov momentum to minimize the objective loss function. This method normalizes the gradients by an exponential moving average of the gradient magnitude for each parameter at each iteration, and uses Nesterov momentum to counteract excessive velocities by “peeking ahead” at the objective values in the candidate search direction. The resulting method converges faster than the original one. Experimental results on several datasets show that the proposed method outperforms several versions of mm-tSNE, with or without regularization, as measured by the neighborhood preservation ratio and error rate. This suggests that the modified mm-tSNE regularization can be applied directly in other domains, including social, biological, and microbiome datasets.
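The optimizer described above can be illustrated with a minimal sketch. The abstract does not give the exact update rule, so the formulation below (gradient evaluated at a look-ahead point, then divided by the root of a running average of squared gradients) is one common way of combining RMSProp with Nesterov momentum and is an assumption, not necessarily the authors' implementation. The function name `rmsprop_nesterov`, the hyperparameter values, and the toy quadratic objective standing in for the mm-tSNE loss are all hypothetical.

```python
import numpy as np

def rmsprop_nesterov(grad_fn, theta0, lr=0.01, decay=0.9, momentum=0.9,
                     eps=1e-8, n_iter=500):
    """Sketch of RMSProp with Nesterov momentum (assumed formulation)."""
    theta = np.asarray(theta0, dtype=float).copy()
    velocity = np.zeros_like(theta)   # accumulated update direction
    avg_sq = np.zeros_like(theta)     # running average of squared gradients
    for _ in range(n_iter):
        # "Peek ahead": evaluate the gradient where the momentum would take us.
        lookahead = theta + momentum * velocity
        g = grad_fn(lookahead)
        # Exponential moving average of the squared gradient, per parameter.
        avg_sq = decay * avg_sq + (1.0 - decay) * g ** 2
        # Normalize the gradient by its recent magnitude, then update velocity.
        velocity = momentum * velocity - lr * g / (np.sqrt(avg_sq) + eps)
        theta = theta + velocity
    return theta

# Toy objective (hypothetical stand-in for the mm-tSNE loss):
# f(x) = ||x - target||^2, whose gradient is 2 * (x - target).
target = np.array([3.0, -1.0])

def quadratic_grad(x):
    return 2.0 * (x - target)

theta_star = rmsprop_nesterov(quadratic_grad, np.zeros(2))
print("final parameters:", theta_star)
print("distance to minimum:", np.linalg.norm(theta_star - target))
```

With a constant learning rate, the normalized steps do not shrink near the minimum, so the iterate hovers around the optimum rather than settling exactly on it; a decaying learning rate would be a natural refinement in practice.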
