Automated discovery of product preferences in ubiquitous social media data: A case study of automobile market

Social media enables ubiquitous communication, allowing users to disseminate and receive information anywhere and at any time. Within this increasingly vast pool of social media data are opinionated messages that convey users' experiences with products. Knowledge extracted from such messages could prove useful to manufacturers and designers seeking to develop next-generation products that better meet the needs of the market. Recent developments in machine learning algorithms make it possible to analyze large-scale social media networks and automatically discover patterns within them. Although previous literature has shown that customers' preferences for smartphones can be extracted from Twitter data, doubts remain as to whether the proposed algorithms generalize to other product domains. In this paper, we illustrate that the methodology proposed in the previous literature can also be applied to automobile products, for which user-generated content in social media is quite limited compared with more mainstream products such as smartphones.
