Incorporating prior knowledge on the digital media creation process into audio classifiers

Audio effects such as reverberation, equalization, and dynamic compression are commonly used in music content creation. Although such effects have a clear impact on audio features, they are rarely taken into account when building an automatic audio classifier. This paper shows that incorporating prior knowledge of the digital media creation chain can clearly improve the robustness of audio classifiers, as demonstrated on a musical instrument recognition task. The proposed system is based on a robust feature selection strategy, a novel use of the virtual support vector machines technique, and a specific equalization used to normalize the signals to be classified. The robustness of the proposed system is demonstrated experimentally on a rather large and varied sound database.