cube2net: Efficient Quality Network Construction with Data Cube Organization

Networks are widely used to model structured data and enable various downstream applications. However, in the real world, most data are structureless, and the assumption of a given network for each particular task is often invalid. In this work, given a set of objects, we propose to leverage a data cube to organize their enormous ambient data. Upon that, we further provide a reinforcement learning algorithm to automatically explore the cube structure and efficiently select appropriate data for the construction of a quality network, which can facilitate various tasks on the given set of objects. With extensive experiments on two classic network mining tasks over different large real-world datasets, we show that our proposed cube2net pipeline is general, and much more effective and efficient in quality network construction than other methods that do not leverage the data cube or reinforcement learning.
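As a rough illustration of the two stages named above (organizing ambient data into cube cells, then building a network from selected cells), the following minimal sketch may help. It is not the authors' implementation: the toy records, the choice of (venue, year) as cube dimensions, the co-author edge definition, and the function names `build_cube` and `build_network` are all assumptions made for illustration. The reinforcement learning step that chooses cells is not shown; the selected cells are simply passed in by hand.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (authors, venue, year). Not taken from the paper.
PAPERS = [
    (("ann", "bob"), "KDD", 2016),
    (("bob", "cat"), "KDD", 2017),
    (("ann", "dan"), "WWW", 2017),
    (("cat", "dan", "eve"), "ICML", 2017),
]


def build_cube(papers):
    """Group records into cube cells keyed by (venue, year)."""
    cube = defaultdict(list)
    for authors, venue, year in papers:
        cube[(venue, year)].append(authors)
    return cube


def build_network(cube, selected_cells):
    """Build a weighted co-author edge map from records in the selected cells."""
    edges = defaultdict(int)
    for cell in selected_cells:
        for authors in cube.get(cell, []):
            # Every pair of co-authors on one record contributes one unit of weight.
            for u, v in combinations(sorted(authors), 2):
                edges[(u, v)] += 1
    return dict(edges)


cube = build_cube(PAPERS)
# Cell selection is supplied by hand here; in the paper it is learned.
network = build_network(cube, [("KDD", 2016), ("KDD", 2017)])
print(network)
```

Selecting different cells yields a different network over the same set of objects, which is the degree of freedom the abstract's selection algorithm is meant to exploit.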
