Information Retrieval: 25th China Conference, CCIR 2019, Fuzhou, China, September 20–22, 2019, Proceedings

This paper introduces a novel method for mining user profiles (e.g., age, gender) from the query log of a search engine. The proposed method combines the representation-learning strength of neural networks with the interpretability of topic models. This is achieved by plugging a parametric Gaussian mixture distribution layer into the neural network. Specifically, the method first uses a convolutional neural network to model the query content, generating a dense vector representation for each query. Based on this representation, it infers the search topic of the query by fitting a Gaussian mixture distribution, which yields the query topic distribution. It then derives the distribution of topics that the user cares about by aggregating the query topic distributions of all of the user's queries. Profile prediction is performed on the resulting user topic distribution. We evaluated this framework on a real search engine data set containing 40,000 users labeled with age, gender, and education level. The experimental results demonstrate the effectiveness of the proposed model.
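
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: random vectors stand in for the CNN query representations, the mixture uses diagonal covariances with fixed (not learned) parameters, aggregation is assumed to be a simple mean over the user's queries, and the classifier is an untrained softmax layer. All function names, shapes, and parameter values are hypothetical.

```python
import numpy as np

def gmm_responsibilities(x, weights, means, variances):
    """Posterior p(topic k | query vector) under a diagonal-covariance Gaussian mixture.

    x: (n_queries, dim); weights: (n_topics,); means, variances: (n_topics, dim).
    """
    diff = x[:, None, :] - means[None, :, :]                      # (n_queries, n_topics, dim)
    log_pdf = -0.5 * np.sum(diff**2 / variances
                            + np.log(2 * np.pi * variances), axis=2)
    log_joint = np.log(weights) + log_pdf                         # (n_queries, n_topics)
    log_joint -= log_joint.max(axis=1, keepdims=True)             # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

def user_topic_distribution(query_topics):
    """Aggregate per-query topic distributions (assumption: plain average)."""
    return query_topics.mean(axis=0)

def predict_profile(user_topics, W, b):
    """Softmax over profile classes computed from the user topic distribution."""
    logits = user_topics @ W + b
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Toy usage: 5 queries from one user, 4 topics, 8-dim vectors, 3 profile classes.
rng = np.random.default_rng(0)
n_topics, dim, n_classes = 4, 8, 3
query_vectors = rng.normal(size=(5, dim))        # stand-in for CNN outputs
weights = np.full(n_topics, 1.0 / n_topics)      # uniform mixing weights (assumed)
means = rng.normal(size=(n_topics, dim))
variances = np.ones((n_topics, dim))

query_topics = gmm_responsibilities(query_vectors, weights, means, variances)
user_topics = user_topic_distribution(query_topics)
probs = predict_profile(user_topics,
                        rng.normal(size=(n_topics, n_classes)),
                        np.zeros(n_classes))
```

Because each row of `query_topics` is a probability distribution, their mean `user_topics` is one as well, which keeps the user-level representation interpretable as a mixture over topics. In the method described in the abstract, the mixture layer sits inside the network, so its parameters would presumably be trained jointly with the rest of the model rather than fixed as they are here.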
