TUDataset: A collection of benchmark datasets for learning with graphs

Recently, there has been increasing interest in (supervised) learning with graph data, especially using graph neural networks. However, the development of meaningful benchmark datasets and standardized evaluation procedures is lagging, which hinders progress in this area. To address this, we introduce the TUDataset for graph classification and regression. The collection consists of over 120 datasets of varying sizes from a wide range of applications. We provide Python-based data loaders, kernel and graph neural network baseline implementations, and evaluation tools. Here, we give an overview of the datasets and the standardized evaluation procedures, and provide baseline experiments. All datasets are available at this http URL. The experiments are fully reproducible from the code available at this http URL.
