Objective characterization of audio signal quality: Applications to music collection description

In this paper, we propose a set of audio features that describe the quality of an audio signal. Here, audio quality is understood as the result of the chain of processes and effects applied to the individual instrument tracks to obtain the final mix of a musical piece. It also depends on the mastering processes applied to the final mix and on the signal degradation caused by MP3 compression. To evaluate our proposal, we created a large set of artificial mixes and also used real-world studio mixes. Using unsupervised and supervised classification methods, we show that the proposed audio features can detect the processing chain. Since the processing chain applied in professional studios has evolved over the years, we also use our audio features to predict the decade in which a music track was recorded.
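The abstract does not enumerate the proposed features, so the following is only an illustrative sketch of how a signal descriptor can respond to a processing step. It uses the crest factor (peak-to-RMS ratio), a standard measure that is an assumption here and is not claimed to be one of the paper's features. The function names `crest_factor_db` and `hard_clip` are hypothetical, and hard clipping is a crude stand-in for a mastering limiter.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio of a sequence of float samples, in decibels."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

def hard_clip(samples, threshold):
    """Crude stand-in for a limiter (assumption): clamp to +/- threshold."""
    return [max(-threshold, min(threshold, s)) for s in samples]

# Synthetic test signal: one second of a 440 Hz sine sampled at 8 kHz.
sr = 8000
sine = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]
clipped = hard_clip(sine, 0.5)

cf_sine = crest_factor_db(sine)        # about 3 dB for a pure sine
cf_clipped = crest_factor_db(clipped)  # lower: clipping flattens the peaks
print(f"sine: {cf_sine:.2f} dB, clipped: {cf_clipped:.2f} dB")
```

The clipped signal has a lower crest factor than the original sine, which is the kind of measurable trace that a feature-based classifier could in principle exploit to recognize a processing step.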
