An architectural framework for information integration using machine learning approaches for smart city security profiling

Over the past few decades, terrorism and other law-and-order incidents have affected countries around the world, and newspapers have covered these events in detail. However, to the best of our knowledge, no existing information system accumulates and analyzes these reports to help devise strategies for avoiding or minimizing such incidents in the future. This research proposes a generic architectural framework that semi-automatically collects law-and-order-related news from different news portals and classifies it using machine learning approaches. The framework covers all the important components: data ingestion, preprocessor, reporting and visualization, and pattern recognition. The information extractor and news classifier have been implemented; the classification sub-component applies widely used text classifiers to a manually compiled data set of almost 5000 news articles. The results show that both the support vector machine and multinomial Naïve Bayes classifiers achieve almost 90% accuracy. Finally, a generic method for calculating the security profile of a city or region has been developed, supported by visualization and reporting components that plot this information on maps using a geographical information system.
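To make the classification step concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing, written in plain Python. It is not the authors' implementation: the `tokenize` helper, the `MultinomialNB` class, the two labels, and the toy headlines are all assumptions invented for illustration, and the real system was trained on the manually compiled news data set described above.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a stand-in for the preprocessor)."""
    return re.findall(r"[a-z]+", text.lower())


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy headlines and labels, invented for this sketch.
train_docs = [
    "Bomb blast kills five in market",
    "Gunmen attack police checkpoint",
    "Armed robbery at city bank",
    "Thieves steal car from parking lot",
    "Explosion targets security convoy",
    "Burglars rob jewellery shop",
]
train_labels = ["terrorism", "terrorism", "crime", "crime", "terrorism", "crime"]

model = MultinomialNB().fit(train_docs, train_labels)
prediction = model.predict("Blast near police convoy")
print(prediction)  # prints "terrorism"
```

On a realistic corpus, a library implementation with TF-IDF features and held-out evaluation would be the more usual choice; the sketch only shows how class priors and smoothed token likelihoods combine into a predicted label.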
