Learning multi-faceted representations of individuals from heterogeneous evidence using neural networks

Inferring latent attributes of people online is an important social computing task, but it requires integrating the many heterogeneous sources of information available on the web. We propose learning individual representations of people using neural networks that integrate rich linguistic and network evidence gathered from social media. The algorithm combines diverse cues, such as the text a person writes, their attributes (e.g., gender, employer, education, location), and their social relations to other people. We show that by integrating both textual and network evidence, these representations improve performance on four important social media inference tasks on Twitter: predicting (1) gender, (2) occupation, (3) location, and (4) friendships for users. Our approach scales to large datasets, and the learned representations can serve as general features with the potential to benefit a large number of downstream tasks, including link prediction, community detection, and probabilistic reasoning over social networks.
