Secondary Learning and Kernel Initialization on Auto-Tagging of Music Events Using Convolutional Neural Networks

In this paper, we present a deep secondary learning method based on convolutional neural networks (CNNs) that takes into account the prediction results of a previous tagging method. In particular, we initialize the kernel values by sampling from existing instrument signal patterns. We evaluate the tagging of male, female, and no-vocal events in 100 popular songs. Compared with the previous method, precision, recall, and accuracy increase by 14.8%, 24.96%, and 2.97%, respectively.
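
To make the two ideas in the abstract concrete, the following minimal sketch illustrates one way they could be realized. It is not the implementation used in this paper. It assumes that instrument signal patterns are available as 2-D time-frequency arrays, that kernels are initialized from random patches that are then normalized to zero mean and unit norm, and that the previous method's per-frame scores are appended to the input features as extra rows. The names `sample_kernels` and `stack_with_previous`, the array shapes, and the normalization step are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_kernels(patterns, num_kernels, kernel_shape, rng):
    """Draw random patches from reference patterns to use as initial kernels.

    patterns: list of 2-D arrays (assumed here: frequency bins x frames).
    Returns an array of shape (num_kernels, kernel_height, kernel_width).
    """
    kh, kw = kernel_shape
    kernels = np.empty((num_kernels, kh, kw), dtype=np.float32)
    for i in range(num_kernels):
        # Pick one reference pattern, then a random patch position inside it.
        pattern = patterns[rng.integers(len(patterns))]
        top = rng.integers(pattern.shape[0] - kh + 1)
        left = rng.integers(pattern.shape[1] - kw + 1)
        patch = pattern[top:top + kh, left:left + kw]
        # Assumption: normalize each patch to zero mean and unit norm.
        patch = patch - patch.mean()
        norm = np.linalg.norm(patch)
        kernels[i] = patch / norm if norm > 0 else patch
    return kernels


def stack_with_previous(features, previous_scores):
    """Append a previous tagger's per-frame scores to the input features.

    features: (n_bins, n_frames); previous_scores: (n_classes, n_frames).
    """
    return np.concatenate([features, previous_scores], axis=0)


rng = np.random.default_rng(0)
# Stand-in random data; real instrument patterns would be used instead.
patterns = [rng.random((64, 100)) for _ in range(3)]
kernels = sample_kernels(patterns, num_kernels=8, kernel_shape=(5, 5), rng=rng)
stacked = stack_with_previous(np.zeros((64, 100)), np.zeros((3, 100)))
```

The resulting `kernels` array could then be copied into the first convolutional layer of a CNN in place of a random initialization, and `stacked` would serve as the network input for the secondary learning stage.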