Robust online music identification using spectral entropy in the compressed domain

Audio identification has been an active research field with wide applications for years. However, most previously reported methods operate on raw audio, even though compressed audio, especially MP3 music, has become the dominant format for transmission over the Internet. Existing compressed-domain methods mostly rely on MDCT coefficients or energy-type features derived from them. As a first attempt, in this paper we propose a novel audio fingerprinting algorithm that uses compressed-domain spectral entropy as the audio feature. The resulting fingerprint is strongly robust against various audio signal distortions, including recompression, noise interference, echo addition, equalization, band-pass filtering, pitch shifting, and moderate time-scale modification. In addition, the compressed-domain algorithm can be applied in the Internet of Things (IoT). Experimental results show that, on our test database of 9823 popular songs, a 5 s music clip can be transmitted in an IoT setting and used to identify its original recording with a top-five precision rate of more than 90%, even under the severe time-frequency distortions listed above.
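To make the central feature concrete, the following is a minimal sketch of spectral entropy computed over one frame of spectral coefficients. It is an illustration under stated assumptions, not the paper's actual extraction procedure: the abstract does not specify how coefficients are grouped or how the fingerprint is built from the entropies. The function name `spectral_entropy` and the assumption that a frame's MDCT coefficients are already available as a list of floats (for example, from a partial MP3 decode) are hypothetical.

```python
import math


def spectral_entropy(coeffs):
    """Shannon entropy (in bits) of the normalized energy distribution
    of one frame of spectral coefficients.

    Assumption: `coeffs` holds the MDCT coefficients of a single frame
    (or sub-band), obtained elsewhere from the compressed bitstream.
    """
    energies = [c * c for c in coeffs]
    total = sum(energies)
    if total == 0.0:
        # Silent frame: entropy is set to 0 here by convention.
        return 0.0
    entropy = 0.0
    for e in energies:
        if e > 0.0:
            p = e / total  # share of the frame energy in this coefficient
            entropy -= p * math.log2(p)
    return entropy


# Usage: a flat spectrum has maximal entropy, a single peak has none.
flat = spectral_entropy([1.0, 1.0, 1.0, 1.0])
peak = spectral_entropy([3.0, 0.0, 0.0, 0.0])
```

Because the energies are normalized before the entropy is taken, the value does not change when the whole frame is scaled by a constant gain. This illustrates one reason an entropy feature can be less sensitive to level changes than a raw energy feature.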
