Machine Learning Algorithms for Environmental Sound Recognition: Towards Soundscape Semantics

This paper investigates methods for the automatic recognition and classification of discrete environmental sounds, with the aim of subsequently applying them to the recognition of soundscapes. Audio recognition research has traditionally focused on speech and music; comparatively little work has addressed non-speech environmental sounds. We therefore apply existing techniques that have proven effective in those two domains and compare them comprehensively to determine which is most appropriate for environmental sound recognition.
