QuestionHolic: Hot topic discovery and trend analysis in community question answering systems

Community question answering (CQA) has recently become a popular form of social media in which users can post questions on any topic of interest and get answers from enthusiasts. The variation of topics in questions and answers indicates how users' interests change over time. Discovering hot topics and analyzing the trend of a specific topic can help users focus on the most popular products or events and track how they change. In this paper, we present a hot topic detection and trend analysis system that captures hot topics in a CQA system and tracks their evolution over time. Our system consists of three components: hot term extraction, question clustering, and trend analysis. Experimental results using datasets from Yahoo! Answers show that our system can discover meaningful hot topics. We also show that the evolution of topics over time can be accurately captured by trend graphing. (C) 2010 Elsevier Ltd. All rights reserved.
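The abstract names three stages without giving their details, so the following is only a minimal sketch of how such a pipeline could be wired together, not the authors' method. It assumes a simple burst ratio for hot terms (recent document frequency against an earlier baseline), assigns each question to the first hot term it contains as a stand-in for clustering, and counts matching questions per day as the trend. All function names, parameters, and the scoring formula are hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def hot_terms(recent, earlier, top_k=3, min_count=2):
    """Rank terms by a burst ratio (assumed scoring, not from the paper).

    The score is the share of recent questions containing the term divided
    by the add-one smoothed share of earlier questions containing it.
    """
    recent_df = Counter(t for q in recent for t in set(tokenize(q)))
    earlier_df = Counter(t for q in earlier for t in set(tokenize(q)))
    scores = {}
    for term, count in recent_df.items():
        if count < min_count:
            continue  # ignore terms too rare to count as "hot"
        recent_share = count / len(recent)
        earlier_share = (earlier_df[term] + 1) / (len(earlier) + 1)
        scores[term] = recent_share / earlier_share
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


def cluster_by_hot_term(questions, terms):
    """Group each question under the first hot term it contains.

    This is a deliberately crude stand-in for question clustering.
    """
    clusters = defaultdict(list)
    for question in questions:
        tokens = set(tokenize(question))
        for term in terms:
            if term in tokens:
                clusters[term].append(question)
                break
    return dict(clusters)


def trend(dated_questions, term):
    """Count questions mentioning `term` per day, sorted by day."""
    counts = Counter()
    for day, question in dated_questions:
        if term in tokenize(question):
            counts[day] += 1
    return sorted(counts.items())
```

Given a list of `(day, question)` pairs, `trend(pairs, term)` yields the per-day counts that a trend graph would plot. A real system would likely replace the single-term grouping with a proper clustering step and filter stop words before scoring.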
