Finding the optimum training solution for Byzantine music recognition: A Max/MSP approach

This paper presents the methodology and implementation of a turnkey training system for music (singing voice). The system was implemented in the Max/MSP development environment. We first create a small corpus of hymns for testing purposes by recording four (4) short hymns, each performed three (3) times by the same chanter. The repetition serves to identify the best performance(s) of each hymn, referred to here as the optimum solution(s). The process starts by extracting time-series vectors from the recorded WAV files. After the extraction, a statistical cross-validation method, the jackknife, is applied to find the optimum solution(s), which will be used to train the system. Once the optimum solution(s) are found, the system is ready to be trained. Training and recognition take place in real time using intelligent techniques that combine Hidden Markov Models (HMM) and Dynamic Time Warping (DTW). The system is evaluated simultaneously with the jackknife procedure, and the optimum training solution is highlighted at the same time. Precision and recall are estimated in order to validate the use of the correct singing performance.
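
To make the selection step concrete, the following is a minimal Python sketch rather than the Max/MSP implementation itself. It rests on several assumptions that are not stated in the abstract: each take is reduced to a one-dimensional time series, recognition is approximated by nearest-template DTW alone (the HMM component is omitted), the jackknife is read as "use take k of every hymn as the training template and test on the held-out takes", and takes are ranked by the F1 score of macro-averaged precision and recall. All function names and the toy corpus are hypothetical.

```python
from typing import Dict, List, Sequence

Corpus = Dict[str, List[Sequence[float]]]  # hymn label -> list of takes


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Accumulated DTW cost between two 1-D series (absolute difference)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(query: Sequence[float], templates: Dict[str, Sequence[float]]) -> str:
    """Return the label of the template closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


def evaluate_take(corpus: Corpus, k: int) -> Dict[str, float]:
    """Train on take k of every hymn, test on all other takes (assumed protocol)."""
    templates = {label: takes[k] for label, takes in corpus.items()}
    tp = {label: 0 for label in corpus}
    fp = {label: 0 for label in corpus}
    fn = {label: 0 for label in corpus}
    for label, takes in corpus.items():
        for i, query in enumerate(takes):
            if i == k:
                continue  # the training take is held out of testing
            predicted = classify(query, templates)
            if predicted == label:
                tp[label] += 1
            else:
                fp[predicted] += 1
                fn[label] += 1
    # Macro-averaged precision and recall; an empty denominator counts as 0.
    precisions = [tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0 for c in corpus]
    recalls = [tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0 for c in corpus]
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return {"take": k, "precision": p, "recall": r, "f1": f1}


def find_optimum_take(corpus: Corpus) -> Dict[str, float]:
    """Score every take index and return the best one by F1 (assumed criterion)."""
    n_takes = min(len(takes) for takes in corpus.values())
    scores = [evaluate_take(corpus, k) for k in range(n_takes)]
    return max(scores, key=lambda s: s["f1"])


# Invented toy data: two "hymns", three takes each; the third take of A is poor.
toy_corpus: Corpus = {
    "A": [[1, 2, 3, 4], [1, 2, 2, 3, 4], [9, 9, 9, 9]],
    "B": [[9, 8, 7, 6], [9, 8, 8, 7, 6], [9, 8, 7, 7, 6]],
}
best = find_optimum_take(toy_corpus)
print(best)
```

In this toy corpus the poor take scores worst when used as the training template, so it is not selected. With real recordings, the one-dimensional series would be replaced by whatever feature vectors are extracted from the WAV files, and the local cost in `dtw_distance` by a suitable vector distance.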