Understanding and personalising smart city services using machine learning, The Internet-of-Things and Big Data

This paper explores the potential of Machine Learning (ML) and Artificial Intelligence (AI) to leverage the Internet of Things (IoT) and Big Data in the development of personalised services in Smart Cities. We do this by studying the performance of four well-known ML classification algorithms (Bayes Network (BN), Naïve Bayes (NB), J48, and Nearest Neighbour (NN)) in correlating the effects of weather data (especially rainfall and temperature) with short journeys made by cyclists in London. The performance of the algorithms was assessed in terms of accuracy, trustworthiness and speed. The data sets were provided by Transport for London (TfL) and the UK Met Office. We employed a random sample of some 1,800,000 instances, comprising six individual datasets, which we analysed on the WEKA platform. The results revealed a high degree of correlation between the weather-based attributes and the journey data being analysed. Notable observations were that, on average, the J48 decision tree algorithm performed best in terms of accuracy, while the NN algorithm (IBk, WEKA's k-nearest-neighbour implementation) was the fastest to build models. Finally, we suggest IoT Smart City applications that may benefit from our work.
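
The kind of comparison described above (scoring classifiers on accuracy and on model build time) can be illustrated with a minimal sketch. It is not the authors' WEKA setup: the data are synthetic, the rule linking rainfall and temperature to cycling demand is an assumption made up for illustration, and only simple stand-ins for two of the four algorithms (a Gaussian naive Bayes and a 1-nearest-neighbour classifier) are shown. All function and class names are hypothetical.

```python
import math
import random
import time


def make_synthetic_data(n, seed=0):
    """Hypothetical rows: ((rainfall_mm, temperature_c), 'high' | 'low' demand)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rain = rng.uniform(0.0, 10.0)
        temp = rng.uniform(0.0, 30.0)
        # Assumed rule, for illustration only: dry, warm hours see more journeys.
        score = temp - 3.0 * rain + rng.gauss(0.0, 3.0)
        rows.append(((rain, temp), "high" if score > 0 else "low"))
    return rows


class GaussianNaiveBayes:
    """Naive Bayes with one independent Gaussian per feature and class."""

    def fit(self, rows):
        by_label = {}
        for x, y in rows:
            by_label.setdefault(y, []).append(x)
        self.stats = {}
        for y, xs in by_label.items():
            prior = len(xs) / len(rows)
            params = []
            for col in zip(*xs):
                mean = sum(col) / len(col)
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                params.append((mean, var))
            self.stats[y] = (prior, params)
        return self

    def predict(self, x):
        def log_posterior(y):
            prior, params = self.stats[y]
            lp = math.log(prior)
            for v, (mean, var) in zip(x, params):
                lp += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            return lp
        return max(self.stats, key=log_posterior)


class OneNearestNeighbour:
    """Lazy learner: building the model only stores the training rows."""

    def fit(self, rows):
        self.rows = list(rows)
        return self

    def predict(self, x):
        nearest = min(self.rows, key=lambda r: math.dist(r[0], x))
        return nearest[1]


def evaluate(model, train, test):
    """Return (accuracy on test, seconds spent building the model)."""
    start = time.perf_counter()
    model.fit(train)
    build_seconds = time.perf_counter() - start
    correct = sum(model.predict(x) == y for x, y in test)
    return correct / len(test), build_seconds


data = make_synthetic_data(1200)
train, test = data[:1000], data[1000:]
results = {
    "NaiveBayes": evaluate(GaussianNaiveBayes(), train, test),
    "1-NN": evaluate(OneNearestNeighbour(), train, test),
}
for name, (accuracy, build_seconds) in results.items():
    print(f"{name}: accuracy={accuracy:.3f}, build time={build_seconds:.6f}s")
```

In this sketch the nearest-neighbour model does no work at build time beyond storing the data, which is one plausible reason a k-NN learner can be quick to build even when prediction is slow. The numbers it prints say nothing about the TfL or Met Office data.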
