Extracting Business Intelligence from Online Product Reviews: An Experiment of Automatic Rule-Induction

Online product reviews are a major source of business intelligence (BI) that helps managers and market researchers make important decisions on product development and promotion. However, the large volume of online product review data creates significant information overload, making it difficult to analyze users’ concerns. In this paper, we employ a design science paradigm to develop a new framework for designing BI systems that correlate the textual content and the numerical ratings of online product reviews. Based on the framework, we developed a prototype for extracting the relationship between user ratings and the textual comments posted on Amazon.com’s Web site. Two data mining algorithms were implemented to automatically extract decision rules that guide the understanding of this relationship. We report experimental results of using the prototype to extract rules from online reviews of three products and discuss the managerial implications.
