Constrained Co-embedding Model for User Profiling in Question Answering Communities

In this paper, we study the problem of user profiling in question answering communities. We address the problem by proposing a constrained co-embedding model (CCEM). CCEM jointly infers the embeddings of both users and words in question answering communities so that the similarity between a user and a word can be measured semantically. CCEM is trained under constraints that require the inferred embeddings of users and words to satisfy the following criterion: given a question in the community, the embeddings of users whose answers receive more votes are closer to the embeddings of the words occurring in those answers than are the embeddings of users whose answers receive fewer votes. Experiments on a Chinese dataset, the Zhihu dataset, demonstrate that the proposed co-embedding algorithm outperforms state-of-the-art methods on the task of user profiling.
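The vote-based criterion can be illustrated with a small sketch. This is not the objective used by CCEM, which the abstract does not specify. It assumes, purely for illustration, that closeness is the dot product between a user embedding and the mean embedding of the words in that user's answer, and that violations are penalised with a hinge term. The names `answer_similarity`, `vote_constraint_penalty`, and `margin` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# One answer to a question: (votes, user embedding, embeddings of the answer's words).
Answer = Tuple[int, Vector, List[Vector]]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def answer_similarity(user_vec: Vector, word_vecs: List[Vector]) -> float:
    """Assumed closeness measure: user embedding dotted with the mean word embedding."""
    return dot(user_vec, mean_vector(word_vecs))


def vote_constraint_penalty(answers: List[Answer], margin: float = 0.0) -> float:
    """Sum of hinge penalties over all answer pairs of one question.

    For every pair where answer a has more votes than answer b, the user of a
    should be closer to the words of a than the user of b is to the words of b.
    The hinge form and the margin are illustrative assumptions.
    """
    penalty = 0.0
    for votes_a, user_a, words_a in answers:
        for votes_b, user_b, words_b in answers:
            if votes_a > votes_b:
                gap = answer_similarity(user_a, words_a) - answer_similarity(user_b, words_b)
                penalty += max(0.0, margin - gap)
    return penalty


# Toy 2-dimensional embeddings for two answers to the same question.
satisfied = [
    (10, [1.0, 0.0], [[1.0, 0.0]]),  # more votes, user aligned with the answer words
    (2, [0.0, 1.0], [[1.0, 0.0]]),   # fewer votes, user orthogonal to the answer words
]
violated = [
    (2, [1.0, 0.0], [[1.0, 0.0]]),
    (10, [0.0, 1.0], [[1.0, 0.0]]),
]
print(vote_constraint_penalty(satisfied, margin=0.5))
print(vote_constraint_penalty(violated, margin=0.5))
```

A penalty of zero means every vote-ordered pair already satisfies the criterion under these assumptions; in a training setting such a term would be added to an embedding objective rather than evaluated on its own.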
