Paper information - GPU-based acoustic feature extraction for electronic media processing

GPU-based acoustic feature extraction for electronic media processing

Multicore architectures are frequently used when very high computational power is required. At the same time, current multicore graphics processing units (GPUs), which are designed for parallel data processing, have become applicable to general-purpose computation. Current research projects therefore examine the use of GPUs for a variety of applications. GPUs are particularly attractive for complex multimedia signal processing because they can reduce computation time. One example of electronic media processing is the automated classification of music collections, a highly attractive feature for multimedia terminals. Such tasks rely on the extraction of acoustic features, which are needed to analyse the audio content. The extraction process is highly computation-intensive and can benefit from the parallel computing power of GPUs. This work presents scalable parallelization methods for GPU-based feature extraction applied to huge databases. The advantages of the GPU realization are verified by a quantitative comparison of its computation times with those of a single-core processor implementation.
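The abstract does not say which acoustic features are extracted or how the work is partitioned across GPU threads. As a rough illustration of why this kind of workload tends to parallelize well, the following sketch computes two common frame-level features (zero-crossing rate and RMS energy) on the CPU in plain Python. The feature choice, the frame length, the hop size, and all function names are assumptions made for this example and are not taken from the paper.

```python
import math


def split_frames(samples, frame_len, hop):
    """Cut a signal into fixed-length, possibly overlapping frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root-mean-square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def extract_features(samples, frame_len=1024, hop=512):
    """Return one (zero-crossing rate, RMS energy) pair per frame."""
    frames = split_frames(samples, frame_len, hop)
    # No frame depends on the result of another frame, so this loop is
    # the part that a parallel implementation could spread over many
    # workers (hypothetically, one GPU thread block per frame).
    return [(zero_crossing_rate(f), rms_energy(f)) for f in frames]


# Usage: one second of a 440 Hz sine tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]
features = extract_features(tone)
```

Because each frame is processed independently, the same structure extends naturally across the tracks of a large music collection, which is the kind of scaling the abstract refers to.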

Holger Blume | Ingo Schmadecke | Jonas Morschbach
