Unsupervised feature learning for bootleg detection using deep learning architectures

The widespread diffusion of portable devices capable of capturing high-quality multimedia data, together with the rapid proliferation of media sharing platforms, has led to a dramatic growth in the user-generated content available online. Since this trend is hard to regulate strictly, copyrighted material is often distributed illegally. This is the case for audio bootlegs, i.e., concerts illegally recorded and redistributed by fans. In this paper, we propose a bootleg detector that aims to discriminate among: i) unofficially recorded bootlegs; ii) officially published live concerts; iii) studio recordings from officially released albums. The proposed method is based on audio feature analysis and machine learning techniques. We exploit a deep learning paradigm to extract highly characterizing features from audio excerpts, and a supervised classifier for detection. The method is validated on a dataset of nearly 500 songs, and the results are compared with those of a state-of-the-art detector. The experiments confirm that deep learning techniques can outperform classic feature extraction approaches.
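
The two-stage structure described above (features learned without labels, then a supervised classifier over three classes) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the method of the paper: the abstract does not specify the network, so a tiny tied-weight autoencoder stands in for the deep architecture, a nearest-centroid rule stands in for the supervised classifier, and random synthetic vectors stand in for real audio descriptors. The names `train_autoencoder`, `encode`, `fit_centroids`, and `predict` are hypothetical.

```python
import numpy as np

CLASS_NAMES = ["bootleg", "official live", "studio"]  # the three classes in the abstract


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, n_hidden=8, epochs=300, lr=0.05, seed=0):
    """Fit a one-hidden-layer, tied-weight autoencoder. Labels are never used."""
    rng = np.random.default_rng(seed)
    n_samples, n_in = X.shape
    W = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
    b_h = np.zeros(n_hidden)
    b_o = np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W + b_h)        # encode
        X_hat = H @ W.T + b_o           # linear decode with the same weights
        err = X_hat - X                 # reconstruction error
        dZ = (err @ W) * H * (1.0 - H)  # backpropagate through the sigmoid
        # Gradient of the mean squared reconstruction error; W gets one term
        # from the encoder and one from the decoder because it is shared.
        W -= lr * (X.T @ dZ + err.T @ H) / n_samples
        b_h -= lr * dZ.mean(axis=0)
        b_o -= lr * err.mean(axis=0)
    return W, b_h


def encode(X, W, b_h):
    """Map raw descriptors to the learned feature space."""
    return sigmoid(X @ W + b_h)


def fit_centroids(F, y):
    """Supervised step: one mean feature vector per class."""
    classes = np.unique(y)
    return classes, np.stack([F[y == c].mean(axis=0) for c in classes])


def predict(F, classes, centroids):
    """Assign each sample to the class with the nearest centroid."""
    dist = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dist.argmin(axis=1)]


# Synthetic stand-in data: 30 excerpts per class, 20 descriptors each.
rng = np.random.default_rng(42)
class_means = rng.normal(0.0, 1.0, size=(3, 20))
y = np.repeat(np.arange(3), 30)
X = class_means[y] + 0.3 * rng.normal(size=(90, 20))

W, b_h = train_autoencoder(X)                # unsupervised feature learning
F = encode(X, W, b_h)                        # learned features
classes, centroids = fit_centroids(F, y)     # supervised classifier
pred = predict(F, classes, centroids)
print([CLASS_NAMES[i] for i in pred[:3]])
```

The point of the sketch is only the separation of concerns: the feature extractor is fitted without looking at `y`, and the labels enter only in the classifier stage.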
