Audio Data Mining for Anthropogenic Disaster Identification: An Automatic Taxonomy Approach

Disasters are undesirable and often sudden events that cause human, material, and economic losses exceeding the coping capability of the affected community or society. In recent years, with significant advances in information technology, intelligent systems have been developed to support many aspects of disaster management, including emergency prediction, timely response, and aftermath recovery. This paper addresses anthropogenic disaster identification by exploiting ambient sound data. Specifically, a novel and efficient acoustic event classification scheme is proposed, based on unsupervised acoustic feature learning and a data-driven taxonomy. The proposed framework can accurately identify anthropogenic disaster events (e.g., gunshot, explosion, scream, cry) from dynamic audio data, and it consists of three major stages. First, predominant acoustic patterns are characterized by dictionary learning algorithms, which generate acoustic feature representations that remain robust for recognition under noisy conditions. Second, a hazard sound event taxonomy is created by exploiting probabilistic distances between the extracted class-wise dictionaries. Finally, the taxonomy structure is embedded into a hierarchical classification algorithm to improve event identification performance. The proposed approach is evaluated on a real-world dataset with 10 emergency sound categories and 3,275 clips. In extensive experimental comparisons, the proposed approach achieved state-of-the-art performance in anthropogenic disaster identification.
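
The first two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: plain k-means stands in for the dictionary learner, a symmetric average nearest-atom distance stands in for the probabilistic distance (which the abstract does not specify), and the class names, feature dimensions, and synthetic data are all hypothetical. The third stage, hierarchical classification over the resulting taxonomy, is not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import cdist, squareform


def learn_dictionary(frames, n_atoms, n_iter=20, seed=0):
    """Plain k-means as a stand-in dictionary learner; centroids act as atoms."""
    rng = np.random.default_rng(seed)
    atoms = frames[rng.choice(len(frames), size=n_atoms, replace=False)].copy()
    for _ in range(n_iter):
        labels = cdist(frames, atoms).argmin(axis=1)
        for k in range(n_atoms):
            members = frames[labels == k]
            if len(members):  # keep the old atom if its cluster is empty
                atoms[k] = members.mean(axis=0)
    return atoms


def dictionary_distance(a, b):
    """Symmetric average nearest-atom distance (assumed proxy, not the paper's measure)."""
    d = cdist(a, b)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())


def build_taxonomy(dictionaries, n_groups):
    """Group classes by average-linkage clustering of pairwise dictionary distances."""
    n = len(dictionaries)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = dictionary_distance(
                dictionaries[i], dictionaries[j]
            )
    tree = linkage(squareform(dist), method="average")
    return fcluster(tree, t=n_groups, criterion="maxclust")


# Synthetic 8-dimensional "acoustic frames" for four hypothetical classes.
# Classes with nearby means are meant to mimic acoustically similar events.
rng = np.random.default_rng(42)
centers = {"gunshot": 0.0, "explosion": 0.5, "scream": 10.0, "cry": 10.5}
features = {
    name: rng.normal(loc=c, scale=1.0, size=(200, 8)) for name, c in centers.items()
}

# Stage 1: one dictionary per class. Stage 2: taxonomy from dictionary distances.
dictionaries = [learn_dictionary(x, n_atoms=5) for x in features.values()]
groups = build_taxonomy(dictionaries, n_groups=2)
print(dict(zip(features, groups)))
```

In this toy setting, classes whose dictionaries lie close together end up under the same taxonomy node, so a hierarchical classifier could first decide between the coarse groups and then separate the classes within each group.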
