Exploring Expert Cognition for Attributed Network Embedding

Attributed network embedding has been widely used in modeling real-world systems. The learned low-dimensional vector representations of nodes preserve their proximity in terms of both network topology and node attributes, and different analysis algorithms can then be applied to them. Recent advances in explanation-based learning and human-in-the-loop models show that involving experts can enhance the performance of many learning tasks, because experts have a better grasp of latent information such as domain knowledge, conventions, and hidden relations. This motivates us to employ experts and transform their meaningful cognition into concrete data to advance network embedding. However, learning expert cognition and incorporating it into the embedding remains a challenging task: expert cognition does not have a concrete form, is difficult to measure, and is laborious to obtain. Moreover, a real-world network involves various types of expert cognition, such as the comprehension of word meaning and the discernment of similar nodes, and it is nontrivial to identify the types that could lead to a significant improvement in the embedding. In this paper, we study the novel problem of exploring expert cognition for attributed network embedding and propose a principled framework, NEEC. We formulate the process of learning expert cognition as a task of asking experts a number of concise and general queries. Guided by exemplar theory and prototype theory in cognitive science, the queries are systematically selected and can be generalized to various real-world networks. The answers returned by the experts contain their valuable cognition. We model these answers as new edges and add them directly to the attributed network, upon which different embedding methods can be applied to obtain a more informative embedding representation. Experiments on real-world datasets verify the effectiveness and efficiency of NEEC.
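The idea of turning expert answers into new edges can be illustrated with a minimal sketch. It is not the NEEC implementation: the function name add_expert_edges, the (query node, matched node) answer format, and the fixed edge weight are all assumptions made for illustration, and the query selection step is not shown.

```python
from collections import defaultdict


def add_expert_edges(edges, expert_answers, weight=1.0):
    """Return an undirected weighted adjacency map with expert-derived edges added.

    edges: iterable of (u, v, w) tuples describing the original network.
    expert_answers: iterable of (query_node, matched_node) pairs; this format
        is a hypothetical stand-in for the answers returned by experts.
    weight: assumed weight given to every expert-derived edge.
    """
    adjacency = defaultdict(dict)

    # Load the original topology as a symmetric adjacency map.
    for u, v, w in edges:
        adjacency[u][v] = w
        adjacency[v][u] = w

    # Each expert answer becomes an edge between the two nodes it links.
    for query_node, matched_node in expert_answers:
        if query_node == matched_node:
            continue  # ignore self-loops
        current = adjacency[query_node].get(matched_node, 0.0)
        # Keep the stronger of the existing weight and the expert weight.
        new_weight = max(current, weight)
        adjacency[query_node][matched_node] = new_weight
        adjacency[matched_node][query_node] = new_weight

    return {node: dict(neighbors) for node, neighbors in adjacency.items()}


original_edges = [("a", "b", 1.0), ("b", "c", 1.0)]
answers = [("a", "c"), ("b", "b")]
augmented = add_expert_edges(original_edges, answers, weight=0.5)
```

Because the augmented network keeps the same form as the original, any embedding method that accepts a weighted adjacency structure could in principle be run on it unchanged.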
