VISA: The Voice Integration/Segregation Algorithm

Listeners are capable to perceive multiple voices in music. Adopting a perceptual view of musical ‘voice’ that corresponds to the notion of auditory stream, a computational model is developed that splits musical scores (symbolic musical data) into different voices. A single ‘voice’ may consist of more than one synchronous notes that are perceived as belonging to the same auditory stream; in this sense, the proposed algorithm, may separate a given musical work into fewer voices than the maximum number of notes in the greatest chord. This is paramount, among other, for developing MIR systems that enable pattern recognition and extraction within musically pertinent ‘voices’ (e.g. melodic lines). The algorithm is tested against a small dataset that acts as groundtruth.