Unveiling Topics from Scientific Literature on the Subject of Self-driving Cars using Latent Dirichlet Allocation

Self-driving cars are becoming a popular topic in academia. Consumers of self-driving vehicles have various concerns, such as safety and security. The public sector also has an interest in self-driving cars, for example in amending policies to enable the management of self-driving vehicles in cities, urban planning, and traffic management. In this paper, a corpus of more than 2,700 documents is extracted from the literature across several subject areas to identify latent (hidden) topics related to self-driving cars. Latent Dirichlet Allocation (LDA) is used for topic identification. The results show that the identified topics correspond to valid research areas, including urban planning, driver-car (computer) interaction, self-driving control and system design, ethics in self-driving cars, safety and risk assessment, training dataset quality, and machine learning in self-driving cars. Furthermore, a network visualization of the association graph of terms shows that, among the most frequently discussed concepts, the control of self-driving cars is based on algorithms, data, design, method, and model. The methods and results of this study can be used, if carefully applied, as decision tools in the diverse disciplines disrupted by the introduction of self-driving cars. In future work, we plan to extend this study with a larger dataset and other data mining techniques.
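To make the topic-identification step concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written in plain Python. It is not the authors' pipeline: the four-document toy corpus, the number of topics, the hyperparameters `alpha` and `beta`, and the helper names `fit_lda` and `top_terms` are all assumptions made for illustration.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    alpha and beta are symmetric Dirichlet priors (illustrative values).
    Returns the document-topic counts and the topic-word counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                # Resample the topic and restore the counts.
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


def top_terms(topic_word, n=3):
    """Return the n most frequent terms assigned to each topic."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]


# Hypothetical toy corpus standing in for the extracted literature.
docs = [
    "control algorithm model sensor control design".split(),
    "algorithm model data control method design".split(),
    "policy city planning traffic policy urban".split(),
    "urban planning traffic city management policy".split(),
]
doc_topic, topic_word = fit_lda(docs, n_topics=2)
for k, terms in enumerate(top_terms(topic_word)):
    print(f"topic {k}: {terms}")
```

Each topic is summarized by its highest-count terms, which is how labels such as "urban planning" or "control and system design" would be read off a fitted model. On a real corpus, the number of topics and the priors would need to be chosen with care, and a tested library implementation would normally replace this hand-written sampler.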
