Graph Cepstrum: Spatial Feature Extracted from Partially Connected Microphones

In this paper, we propose an effective and robust method of spatial feature extraction for acoustic scene analysis using distributed microphones that are only partially synchronized and/or closely located. The method introduces a new cepstrum feature, the graph-based cepstrum, which applies a graph-based basis transformation to extract spatial information from the distributed microphones while taking into account which pairs of microphones are synchronized and/or closely located. Specifically, the log-amplitude of a multichannel observation is converted to a feature vector by the inverse graph Fourier transform, a basis transformation for signals defined on a graph. Experiments using real environmental sounds show that the proposed graph-based cepstrum robustly extracts spatial information while reflecting the microphone connections. Moreover, the results indicate that the proposed method classifies acoustic scenes more robustly than conventional spatial features when the observed sounds have a large synchronization mismatch between partially synchronized microphone groups.
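As a rough illustration of the transform described above, the following sketch computes a graph-cepstrum-like feature with NumPy. It is not the paper's implementation: the abstract does not give the exact definition, so the choice of the combinatorial Laplacian (degree matrix minus adjacency matrix), the projection of the log-amplitudes onto the Laplacian eigenvectors, the function name `graph_cepstrum`, and the `eps` floor are all assumptions made for this example.

```python
import numpy as np


def graph_cepstrum(amplitudes, adjacency, eps=1e-12):
    """Project per-microphone log-amplitudes onto a graph Laplacian eigenbasis.

    amplitudes: shape (channels,) or (frames, channels), non-negative values.
    adjacency:  shape (channels, channels), symmetric; a nonzero entry marks a
                pair of microphones treated as connected (assumed here to mean
                synchronized and/or closely located).
    eps:        hypothetical floor that avoids log(0).
    """
    A = np.asarray(adjacency, dtype=float)
    # Combinatorial graph Laplacian (assumed form): L = D - A.
    laplacian = np.diag(A.sum(axis=1)) - A
    # eigh returns ascending eigenvalues and orthonormal eigenvectors (columns of U).
    eigvals, U = np.linalg.eigh(laplacian)
    q = np.log(np.asarray(amplitudes, dtype=float) + eps)
    # q @ U equals U.T @ q for each frame: one coefficient per eigenvector.
    return q @ U, eigvals


# Three microphones connected in a chain: 0 - 1 - 2.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]])
feature, eigvals = graph_cepstrum([2.0, 2.0, 2.0], adjacency)
```

With equal amplitudes on a connected graph, all of the energy falls on the coefficient of the constant eigenvector (eigenvalue 0), which matches the intuition that such an observation carries no spatial contrast between microphones.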
