Analyzing Self-Driving Cars on Twitter

This paper studies users' perceptions of a controversial product: self-driving (autonomous) cars. To gauge public opinion on this new technology, we used an annotated Twitter dataset and extracted the topics in positive and negative tweets using topic modeling, an unsupervised probabilistic method. We then used these topics, together with linguistic and Twitter-specific features, to classify the sentiment of the tweets. Our analysis shows that people are optimistic and excited about the technology, but at the same time find it dangerous and unreliable. For the classification task, Twitter-specific features such as hashtags and linguistic features such as emphatic words were among the top attributes for classifying tweet sentiment.
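As a rough illustration of the kind of Twitter-specific and linguistic features mentioned above, the following minimal sketch counts hashtags, mentions, and emphatic cues in a tweet. The function name `extract_features`, the `EMPHATIC_WORDS` list, and the particular cues are assumptions made for illustration only; the abstract does not specify the paper's actual feature set or lexicon.

```python
import re

# Hypothetical emphatic-word list; the paper's actual lexicon is not given here.
EMPHATIC_WORDS = {"really", "so", "very", "totally", "absolutely"}


def extract_features(tweet: str) -> dict:
    """Return a small dictionary of illustrative sentiment features."""
    # Tokenize into words, keeping a leading '#' or '@' attached to its token.
    tokens = re.findall(r"[#@]?\w+", tweet.lower())
    hashtags = [t for t in tokens if t.startswith("#")]
    mentions = [t for t in tokens if t.startswith("@")]
    words = [t for t in tokens if not t.startswith(("#", "@"))]
    return {
        # Twitter-specific features
        "num_hashtags": len(hashtags),
        "num_mentions": len(mentions),
        # Linguistic (emphasis) features
        "num_emphatic": sum(w in EMPHATIC_WORDS for w in words),
        "num_exclamations": tweet.count("!"),
        # A character repeated three or more times, as in "sooo"
        "has_elongation": bool(re.search(r"(\w)\1{2,}", tweet)),
    }


features = extract_features(
    "Sooo excited for #selfdriving cars!!! @Tesla is really ahead"
)
print(features)
```

A feature dictionary of this shape could be combined with per-tweet topic proportions and passed to any standard classifier; which classifier and which features the study actually used is not stated in the abstract.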