Examining Synthetic Databases in Melodic Retrieval Testing

We investigate the practice of using probabilistically generated melodies for large-scale evaluation of query-by-humming systems. We run a set of sung queries against both real and synthetic databases, using a previously verified type of “matching function” to map each sung query to its appropriate target. We find that the accuracy of the generative process can be improved by introducing a first-order Markov assumption into the model, though neither method of melodic generation is found to be a statistically consistent approximation of an actual database under our experimental conditions.
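
As an illustration only, the contrast between the two kinds of generator can be sketched as follows. The sketch assumes that melodies are represented as lists of MIDI pitch numbers, that the model is estimated over pitch intervals, and that the generator without the Markov assumption draws each interval independently; none of these details is stated above, and the names `fit_models` and `generate` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def fit_models(melodies):
    """Count interval statistics from real melodies (lists of MIDI pitches).

    Returns (unigram, bigram): unigram counts each interval on its own,
    bigram[prev] counts the intervals that followed interval `prev`.
    """
    unigram = Counter()
    bigram = defaultdict(Counter)
    for melody in melodies:
        intervals = [b - a for a, b in zip(melody, melody[1:])]
        unigram.update(intervals)
        for prev, cur in zip(intervals, intervals[1:]):
            bigram[prev][cur] += 1
    return unigram, bigram


def _draw(counter, rng):
    """Sample one key of `counter` with probability proportional to its count."""
    values = list(counter.keys())
    weights = list(counter.values())
    return rng.choices(values, weights=weights, k=1)[0]


def generate(unigram, bigram, length, start_pitch=60, markov=True, rng=None):
    """Generate a synthetic melody of `length` pitches.

    markov=False: every interval is drawn independently from `unigram`.
    markov=True:  each interval is conditioned on the previous interval
                  (first-order Markov), falling back to `unigram` for the
                  first step or when `prev` was never followed by anything.
    """
    rng = rng or random.Random()
    pitches = [start_pitch]
    prev = None
    while len(pitches) < length:
        if markov and prev is not None and bigram.get(prev):
            step = _draw(bigram[prev], rng)
        else:
            step = _draw(unigram, rng)
        pitches.append(pitches[-1] + step)
        prev = step
    return pitches


if __name__ == "__main__":
    # Toy "real" database; an actual study would use a full melodic corpus.
    real = [[60, 62, 64, 65, 67], [60, 62, 64, 62, 60]]
    uni, bi = fit_models(real)
    print(generate(uni, bi, 8, markov=False, rng=random.Random(1)))
    print(generate(uni, bi, 8, markov=True, rng=random.Random(1)))
```

A synthetic database built this way could then be searched with the same sung queries as the real one, so that retrieval accuracy on the two can be compared.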