Intonation: A Dataset of Quality Vocal Performances Refined by Spectral Clustering on Pitch Congruence

We introduce the "Intonation" dataset of amateur vocal performances that exhibit a tendency toward good intonation, collected from Smule, Inc. The dataset can be used for music information retrieval tasks such as autotuning, query by humming, and singing style analysis. It is available upon request through the Stanford CCRMA DAMP website. We describe a semi-supervised approach to selecting the audio recordings from a larger collection of performances based on their intonation patterns. The approach can be applied in other situations where a researcher needs to extract a subset of data samples from a large database. A comparison of the "Intonation" dataset with the remaining collection of performances shows that the two have different distributions of intonation behavior.
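The selection idea can be sketched in broad strokes. In this illustrative example, each performance is summarized by a histogram of note-level pitch deviations (in cents) from the nearest equal-tempered semitone, the histograms are grouped by spectral clustering on a k-nearest-neighbor graph, and the cluster whose deviations concentrate nearest zero cents is kept. The synthetic data, the histogram feature, the cluster count, and the selection criterion are all assumptions made for the sketch, not the paper's actual pipeline:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)

def deviation_histogram(cents, bins=25):
    """Normalized histogram of a performance's pitch deviations
    (in cents) from the nearest equal-tempered semitone."""
    hist, _ = np.histogram(cents, bins=bins, range=(-50, 50))
    return hist / hist.sum()

# Synthetic stand-in for per-performance pitch data: 60 performances
# with deviations tight around 0 cents and 60 with much wider spread.
in_tune = [deviation_histogram(rng.normal(0, 8, 300)) for _ in range(60)]
off_key = [deviation_histogram(rng.normal(0, 30, 300)) for _ in range(60)]
X = np.vstack(in_tune + off_key)

# Spectral clustering on a k-nearest-neighbor graph of the histograms.
labels = SpectralClustering(
    n_clusters=2, affinity="nearest_neighbors",
    n_neighbors=10, random_state=0,
).fit_predict(X)

# Keep the cluster whose mean deviation magnitude is smallest,
# i.e. whose performances sit closest to the equal-tempered grid.
centers = np.linspace(-48, 48, 25)  # histogram bin centers, in cents
mean_abs_dev = [np.abs(centers) @ X[labels == k].mean(axis=0) for k in (0, 1)]
selected = labels == int(np.argmin(mean_abs_dev))
```

On real data the deviation values would come from a pitch tracker (for example pYIN) applied to each vocal track, rather than from synthetic draws, and the retained cluster would form the refined subset.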
