CCGL: Contrastive Cascade Graph Learning

Supervised learning, while prevalent for information cascade modeling, often requires abundant labeled data for training, and the trained models do not generalize easily across tasks and datasets. Semi-supervised learning leverages unlabeled data for cascade understanding during pre-training, but it often learns fine-grained, feature-level representations that can easily overfit on downstream tasks. Recently, contrastive self-supervised learning has been designed to alleviate these two fundamental issues in linguistic and visual tasks. However, its direct applicability to cascade modeling, especially graph-based cascade tasks, remains underexplored. In this work, we present Contrastive Cascade Graph Learning (CCGL), a novel framework for cascade graph representation learning in a contrastive, self-supervised, and task-agnostic way. First, CCGL designs an effective data augmentation strategy to capture the variation and uncertainty of cascades. Second, it learns a generic model for graph cascade tasks via self-supervised contrastive pre-training on both unlabeled and labeled data. Third, it learns a task-specific cascade model via fine-tuning on labeled data. Finally, to make the model transferable across datasets and cascade applications, CCGL further enhances the model via distillation with a teacher-student architecture. We demonstrate that CCGL significantly outperforms its supervised and semi-supervised counterparts on several downstream tasks.
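
As a concrete illustration of the pre-training stage, the sketch below shows how such contrastive learning on cascades might look in PyTorch: two stochastically augmented views of each cascade are encoded and projected, and an NT-Xent loss pulls views of the same cascade together while pushing apart views of different cascades. The feature-masking augmentation, MLP encoder, and all hyperparameters are illustrative assumptions standing in for CCGL's actual cascade-graph augmentations and encoder, which the abstract does not specify.

```python
# Minimal sketch of CCGL-style contrastive pre-training (PyTorch).
# The augmentation, encoder, and hyperparameters are illustrative
# assumptions, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy cascade encoder: maps a flattened cascade feature vector to
    an embedding. A real implementation would use a GNN/RNN over the
    cascade graph."""
    def __init__(self, in_dim=128, hid_dim=256, out_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, out_dim),
        )
        # Projection head used only during contrastive pre-training.
        self.proj = nn.Sequential(
            nn.Linear(out_dim, out_dim), nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, x):
        h = self.net(x)        # representation kept for downstream tasks
        z = self.proj(h)       # projection used by the contrastive loss
        return F.normalize(z, dim=1)

def augment(x, drop_prob=0.2):
    """Illustrative augmentation: randomly mask input features. This
    stands in for cascade-graph perturbations such as adding or
    removing nodes and edges."""
    mask = (torch.rand_like(x) > drop_prob).float()
    return x * mask

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss over a batch: each sample's two views are
    positives; all other samples in the batch act as negatives."""
    n = z1.size(0)
    z = torch.cat([z1, z2], dim=0)        # (2n, d), rows are L2-normalized
    sim = z @ z.t() / temperature         # cosine similarities as logits
    sim.fill_diagonal_(float('-inf'))     # exclude trivial self-pairs
    targets = torch.cat([torch.arange(n, 2 * n),   # view i <-> view i+n
                         torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Pre-training loop over (possibly unlabeled) cascades.
encoder = Encoder()
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)
for step in range(100):
    x = torch.randn(32, 128)              # stand-in for a cascade batch
    z1, z2 = encoder(augment(x)), encoder(augment(x))
    loss = nt_xent_loss(z1, z2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

After pre-training, the projection head would be discarded and the encoder fine-tuned on labeled cascades for the target task; the teacher-student distillation step would then train a copy of the encoder to match the fine-tuned model's outputs, following the high-level recipe the abstract describes.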
