Unsupervised deep semantic and logical analysis for identification of solution posts from community answers

Discussion forums today provide dependable solutions to problems across many domains. However, the large number of less-informative or inappropriate posts makes identifying the appropriate problem-solution pairs a challenging task. The growing variety of topics and domains also makes manual labelling of problem-solution post pairs costly and time-consuming. To address these issues, we concentrate on the deep semantic and logical relations between terms. We introduce a novel semantic correlation graph to represent the text; this representation helps identify topical and semantic relations between terms at a fine-grained level. Next, we apply an improved version of personalised PageRank using random walk with restarts, with the aim of raising the rank score of terms that are directly or indirectly related to terms in the given question. Finally, we use a node-overlapping version of GAAC to find the actual span of the answer text. Our experimental results show that the devised system performs better than existing unsupervised systems.
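
To make the ranking step concrete, the following is a minimal sketch of plain random walk with restart over a weighted term graph, where the walk restarts at the question terms. It is not the improved variant described above, whose details are not given here. The function name, the toy graph, its edge weights, and the restart probability of 0.15 are all illustrative assumptions.

```python
def random_walk_with_restart(graph, seeds, restart=0.15, iters=200, tol=1e-12):
    """Score nodes by a random walk that restarts at the seed nodes.

    graph: dict mapping node -> {neighbour: edge weight}; every neighbour
           must also appear as a key (hypothetical term graph).
    seeds: non-empty set of nodes to restart at (e.g. question terms).
    """
    nodes = list(graph)
    # Restart distribution: uniform over the seed nodes, zero elsewhere.
    restart_vec = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart_vec)
    for _ in range(iters):
        # With probability `restart`, the walker jumps back to a seed.
        nxt = {n: restart * restart_vec[n] for n in nodes}
        for u in nodes:
            total = sum(graph[u].values())
            if total == 0:
                # Node without edges: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * scores[u] * restart_vec[n]
                continue
            # Otherwise follow an edge with probability proportional to its weight.
            for v, w in graph[u].items():
                nxt[v] += (1 - restart) * scores[u] * w / total
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


# Toy term graph (assumed weights); "weather" is unrelated to the question.
term_graph = {
    "printer": {"driver": 1.0, "install": 0.5},
    "driver": {"printer": 1.0, "update": 1.0},
    "install": {"printer": 0.5},
    "update": {"driver": 1.0},
    "weather": {},
}
scores = random_walk_with_restart(term_graph, seeds={"printer"})
for term in sorted(scores, key=scores.get, reverse=True):
    print(f"{term}: {scores[term]:.4f}")
```

In this sketch, terms reachable from the question term "printer" receive a positive score even when they are only indirectly connected (for example "update", via "driver"), while the disconnected term "weather" receives none.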
