Mining FAQ from forum threads: theoretical framework

Frequently Asked Questions (FAQ) sections are becoming increasingly common on websites. Research has concentrated on FAQ retrieval rather than FAQ construction. An FAQ can be constructed from a number of sources, but at present it is mostly compiled manually by help desk staff, which tends to make it static. This paper presents a comprehensive review of the components needed for effective mining of FAQs from forum threads: pre-processing, mining of questions, mining of answers and mining of the FAQ. Besides the general idea and concept, we discuss the strengths and limitations of the techniques used in each component. The review addresses the following questions. What kind of pre-processing technique is needed for mining FAQs from forums? What are the recent techniques for mining questions from forum threads? What approaches currently dominate answer retrieval from forum threads? How can FAQs be clustered out of a question-and-answer database?
