HCGA: Highly comparative graph analysis for network phenotyping

Summary

Networks are widely used as mathematical models of complex systems across many scientific disciplines. Decades of work have produced a vast corpus of research characterizing the topological, combinatorial, statistical, and spectral properties of graphs. Each graph property can be thought of as a feature that captures important (and sometimes overlapping) characteristics of a network. In this paper, we introduce HCGA, a framework for highly comparative analysis of graph datasets that computes several thousand graph features from any given network. HCGA also offers a suite of statistical learning and data analysis tools for the automated identification and selection of the important, interpretable features that characterize a graph dataset. We show that HCGA outperforms other methodologies on supervised classification tasks on benchmark datasets while retaining the interpretability of network features. We illustrate HCGA with two applications: predicting charge transfer in organic semiconductors and clustering a dataset of neuronal morphology images.
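The idea of treating each graph property as a feature can be illustrated with a minimal sketch. This is not the HCGA API: the function name `graph_features`, its arguments, and the four features chosen are hypothetical, and the code uses only the Python standard library. It maps an undirected simple graph to a small dictionary of named, interpretable features, a tiny stand-in for the several thousand features the framework computes.

```python
from itertools import combinations
from statistics import mean, pvariance


def graph_features(edges, n_nodes):
    """Map an undirected simple graph to a dict of named features.

    Hypothetical helper for illustration only; nodes are assumed to be
    labeled 0 .. n_nodes - 1 and self-loops are ignored.
    """
    # Build an adjacency-set representation of the graph.
    adj = {v: set() for v in range(n_nodes)}
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    degrees = [len(adj[v]) for v in adj]
    n_edges = sum(degrees) // 2

    def local_clustering(v):
        # Fraction of pairs of neighbours of v that are themselves linked.
        k = len(adj[v])
        if k < 2:
            return 0.0
        links = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
        return 2.0 * links / (k * (k - 1))

    return {
        "density": 2.0 * n_edges / (n_nodes * (n_nodes - 1)) if n_nodes > 1 else 0.0,
        "degree_mean": mean(degrees),
        "degree_var": pvariance(degrees),
        "clustering_mean": mean(local_clustering(v) for v in adj),
    }


# Two toy graphs on three nodes: a triangle and a path.
triangle = graph_features([(0, 1), (1, 2), (0, 2)], 3)
path = graph_features([(0, 1), (1, 2)], 3)
```

Stacking such dictionaries over every graph in a dataset gives a feature matrix with one row per graph and one named column per feature. Any standard classifier or clustering method can then be applied to that matrix, and because each column is a named graph property, the features a model relies on remain interpretable.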
