Robust audio surveillance using spectrogram image texture feature

A sound signal produces a unique texture that can be visualized as a spectrogram image and analyzed for automatic sound recognition. In this paper, we explore the use of a well-known image texture analysis technique, the gray-level co-occurrence matrix (GLCM), for sound recognition in an audio surveillance application. The GLCM captures the distribution of co-occurring gray-level values at a given offset. Unlike most similar research, which derives features from the GLCM, we use the matrix values themselves to form the feature vector, with the analysis carried out in subbands. Compared with a baseline feature from related work, the proposed spectrogram image texture feature (SITF) gives marginally lower results under clean and high signal-to-noise ratio (SNR) conditions but significantly better results at low SNR, where the baseline feature was seen to be less effective.
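The idea of using the GLCM values directly, computed per subband, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the number of gray levels (8), the number of subbands (4), the single horizontal offset, the min-max quantization over the whole spectrogram, and the normalization of each matrix to sum to one are all assumptions made here for illustration, since the abstract does not state these settings.

```python
import numpy as np

def quantize(image, levels):
    """Map a 2-D array to integer gray levels 0..levels-1 by min-max scaling (assumed scheme)."""
    image = np.asarray(image, dtype=float)
    span = image.max() - image.min()
    if span == 0:
        return np.zeros(image.shape, dtype=int)
    scaled = (image - image.min()) / span
    # The maximum value maps to `levels`, so clip it back into range.
    return np.minimum((scaled * levels).astype(int), levels - 1)

def glcm(gray, levels, offset=(0, 1)):
    """Normalized counts of gray-level pairs (i, j) separated by a (row, col) offset."""
    dr, dc = offset
    rows, cols = gray.shape
    # Crop so that both the reference pixel and its offset neighbor stay in bounds.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    ref = gray[r0:r1, c0:c1]
    nbr = gray[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
    matrix = np.zeros((levels, levels), dtype=float)
    # Unbuffered accumulation so repeated (i, j) pairs are all counted.
    np.add.at(matrix, (ref.ravel(), nbr.ravel()), 1)
    total = matrix.sum()
    return matrix / total if total > 0 else matrix

def subband_glcm_feature(spectrogram, levels=8, n_subbands=4, offset=(0, 1)):
    """Concatenate the flattened GLCM of each frequency subband (hypothetical settings)."""
    gray = quantize(spectrogram, levels)
    # Rows are assumed to be frequency bins, columns time frames.
    bands = np.array_split(gray, n_subbands, axis=0)
    return np.concatenate([glcm(band, levels, offset).ravel() for band in bands])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec = rng.random((32, 20))  # stand-in for a log-power spectrogram
    feature = subband_glcm_feature(spec)
    print(feature.shape)
```

With these assumed settings, the feature vector has `levels * levels * n_subbands` entries, one per matrix cell per subband, rather than a handful of summary statistics derived from each matrix.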
