Automatically Quantifying Customer Need Tweets: Towards a Supervised Machine Learning Approach

Eliciting customer needs is an important task for businesses that want to design customer-centric products and services. While different approaches are available, most lack automation, scalability, and monitoring capabilities. In this work, we demonstrate the feasibility of automatically identifying and quantifying customer needs by training and evaluating a supervised machine learning approach on previously labeled Twitter data. Our results show that the classification performances are statistically superior, but can be further improved in the future.
