Mining Open Answers in Questionnaire Data

Surveys are important tools for marketing and for managing customer relationships. The answers to open-ended questions, in particular, often contain valuable information and provide an important basis for business decisions. The summaries that human analysts make of these open answers, however, tend to rely too heavily on intuition and so are not sufficiently reliable. Moreover, because the Web makes it so easy to conduct surveys and solicit comments, companies are finding themselves inundated with data from questionnaires and other sources. Handling it all manually would be not only cumbersome but also costly. Devising a computer system that can automatically mine useful information from open answers has therefore become an important issue. We have developed a survey analysis system for this purpose. The system mines open answers through two statistical learning techniques: rule learning (which we call rule analysis) and correspondence analysis.
