OpenWGL: Open-World Graph Learning

In traditional graph learning tasks, such as node classification, learning is carried out in a closed-world setting: the set of classes and their training samples are provided to train models, and the goal is to correctly classify unlabeled nodes into the already known classes. In reality, due to limited labeling capability and the dynamic evolution of networks, some nodes may not belong to any existing (seen) class, and therefore cannot be correctly classified by closed-world learning algorithms. In this paper, we propose a new open-world graph learning paradigm, where the goal is not only to classify nodes belonging to seen classes into the correct groups, but also to assign nodes not belonging to any existing class to an unseen class. The essential challenges of open-world graph learning are that (1) the unseen class has no labeled samples and may exist in an arbitrary form different from the existing seen classes; and (2) both graph feature learning and prediction must differentiate whether a node belongs to a seen class or to the unseen class. To tackle these challenges, we propose an uncertain node representation learning approach based on constrained variational graph autoencoder networks, where label loss and class uncertainty loss constraints ensure that the learned node representations are sensitive to the unseen class. As a result, node embedding features are represented by distributions instead of deterministic feature vectors. By using a sampling process to generate multiple versions of the feature vectors, we are able to test the certainty with which a node belongs to the seen classes, and automatically determine a threshold to reject nodes not belonging to seen classes as unseen-class nodes. Experiments on real-world networks demonstrate the algorithm's performance compared to baselines. Case studies and ablation analysis also show the rationale of our design for open-world graph learning.
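The sampling-and-rejection idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each node embedding is a diagonal Gaussian given by a mean `mu` and standard deviation `sigma`, uses a hypothetical linear classifier `weights` in place of the trained network, and takes `threshold` as a fixed input rather than determining it automatically. The function name `classify_with_rejection` and the label `-1` for the unseen class are likewise assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_rejection(mu, sigma, weights, threshold,
                            num_samples=100, seed=0):
    """Sample embeddings from N(mu, sigma^2), average the seen-class
    probabilities, and reject the node as 'unseen' (-1) when the best
    averaged probability falls below `threshold`.

    `weights` is a hypothetical linear classifier (one row per seen class);
    `threshold` is supplied by the caller here, which is an assumption.
    """
    rng = random.Random(seed)
    num_classes = len(weights)
    mean_probs = [0.0] * num_classes
    for _ in range(num_samples):
        # Draw one version of the feature vector from the embedding distribution.
        z = [rng.gauss(m, s) for m, s in zip(mu, sigma)]
        logits = [sum(w * x for w, x in zip(row, z)) for row in weights]
        probs = softmax(logits)
        for c in range(num_classes):
            mean_probs[c] += probs[c] / num_samples
    best = max(range(num_classes), key=lambda c: mean_probs[c])
    if mean_probs[best] < threshold:
        return -1, mean_probs  # rejected: treated as an unseen-class node
    return best, mean_probs


# Two seen classes, two-dimensional embeddings (toy values).
weights = [[5.0, 0.0], [0.0, 5.0]]
confident_label, confident_probs = classify_with_rejection(
    mu=[2.0, 0.0], sigma=[0.1, 0.1], weights=weights, threshold=0.8)
uncertain_label, uncertain_probs = classify_with_rejection(
    mu=[0.0, 0.0], sigma=[0.1, 0.1], weights=weights, threshold=0.8)
```

In this toy setup, a node whose embedding distribution sits firmly on one seen class keeps that label, while a node whose samples give no seen class a high averaged probability is rejected.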
