Modeling Joint Representation with Tri-Modal Deep Belief Networks for Query and Question Matching

One of the main research tasks in community question answering (cQA) is finding the most relevant questions for a given new query, thereby providing useful knowledge to users. The straightforward approach is to capitalize on textual features, i.e., a bag-of-words (BoW) representation, to match queries against questions. However, such approaches suffer from the lexical gap: when lexical matching fails, they cannot capture semantic meaning. Latent semantic models, such as latent semantic analysis (LSA), attempt to map a query to its semantically similar questions through a lower-dimensional representation, but LSA is a shallow, linear model and cannot capture the highly non-linear correlations present in cQA data. Moreover, both BoW and semantics-oriented solutions use a single dictionary to represent the query, question, and answer in the same feature space, whereas the correlations we observe in the data imply that they lie in entirely different feature spaces. In light of these observations, this paper proposes a tri-modal deep belief network (tri-DBN) to extract a unified representation for the query, question, and answer, under the hypothesis that they lie in three different feature spaces. We compare the unified representation extracted by our model with other representations using queries on the Yahoo! Answers dataset. Experimental results reveal that the proposed model captures semantic meaning both within and between queries, questions, and answers, and further suggest that the joint representation extracted by the proposed method improves the performance of cQA archive search.
key words: cQA, deep belief networks, joint representation, tri-modal deep belief network
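The core idea can be illustrated with a minimal sketch: train one RBM per modality (query, question, answer), each in its own feature space, then train a joint RBM over the concatenated modality-specific hidden units to obtain a shared representation. This is an assumption-laden toy illustration, not the paper's exact architecture or training procedure; the corpus sizes, layer widths, and hyperparameters below are all invented for the example, and scikit-learn's `BernoulliRBM` stands in for the paper's DBN layers.

```python
# Hedged sketch of a tri-modal joint representation: one RBM per modality,
# then a joint RBM on the concatenated hidden units. Not the paper's model;
# all sizes and hyperparameters here are illustrative assumptions.
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(0)
n_docs, vocab = 200, 50  # toy corpus: 200 triples, 50-word vocabulary

# Binary bag-of-words matrices, one per modality / feature space.
query    = (rng.random((n_docs, vocab)) < 0.1).astype(float)
question = (rng.random((n_docs, vocab)) < 0.1).astype(float)
answer   = (rng.random((n_docs, vocab)) < 0.1).astype(float)

def modality_rbm(X, n_hidden=32):
    """Train a per-modality RBM and return its hidden-unit activations."""
    rbm = BernoulliRBM(n_components=n_hidden, learning_rate=0.05,
                       n_iter=10, random_state=0)
    return rbm.fit_transform(X)

# Modality-specific hidden layers, concatenated into one vector per triple.
h = np.hstack([modality_rbm(m) for m in (query, question, answer)])

# Joint layer: an RBM over the concatenated hidden units yields the
# shared representation that would be used for query-question matching.
joint = BernoulliRBM(n_components=24, learning_rate=0.05,
                     n_iter=10, random_state=0)
z = joint.fit_transform(h)
print(z.shape)  # (200, 24)
```

In a retrieval setting, matching would then reduce to ranking archived questions by similarity (e.g., cosine) between their joint representations and that of the new query.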
