Automatic Transcription of Flamenco Singing From Polyphonic Music Recordings

Automatic note-level transcription is considered one of the most challenging tasks in music information retrieval. Flamenco singing is a particularly difficult case because of its complex melodic progressions, intonation inaccuracies, heavy ornamentation, and the presence of guitar accompaniment. In this study, we explore the limitations of existing state-of-the-art transcription systems when applied to flamenco singing and propose a solution specific to this genre. We first extract the predominant melody and apply a novel contour filtering process to eliminate segments of the pitch contour that originate from the guitar accompaniment. We then formulate a set of onset detection functions based on volume and pitch characteristics to segment the resulting vocal pitch contour into discrete note events. Finally, a quantised pitch label is assigned to each note event by combining global pitch class probabilities with local pitch contour statistics. The proposed system outperforms state-of-the-art singing transcription systems with respect to voicing accuracy, onset detection, and overall performance when evaluated on flamenco singing datasets.
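The last two stages described above (segmenting a vocal pitch contour into notes, then assigning quantised pitch labels) can be illustrated with a minimal Python sketch. It is not the system proposed in the paper: it omits melody extraction, guitar contour filtering, and the volume-based onset functions, and it assumes the input is already a frame-wise vocal f0 contour in Hz with 0 marking unvoiced frames. The function names, the `jump_semitones` threshold, and the `sigma` weighting are hypothetical choices made only for illustration.

```python
import math
from statistics import median


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def segment_notes(f0_hz, jump_semitones=0.8):
    """Split a frame-wise f0 contour into (start, end) frame ranges, end exclusive.

    Frames with f0 <= 0 are treated as unvoiced. A new note starts when voicing
    resumes, or when the pitch departs from the running median of the current
    note by more than `jump_semitones` (an assumed, untuned threshold).
    """
    notes, start, current = [], None, []
    for i, f in enumerate(f0_hz):
        if f <= 0:  # unvoiced frame closes any open note
            if start is not None:
                notes.append((start, i))
                start, current = None, []
            continue
        m = hz_to_midi(f)
        if start is not None and abs(m - median(current)) > jump_semitones:
            notes.append((start, i))  # pitch jump: close the note, open a new one
            start, current = None, []
        if start is None:
            start = i
        current.append(m)
    if start is not None:
        notes.append((start, len(f0_hz)))
    return notes


def pitch_class_probs(f0_hz):
    """Global pitch class distribution over all voiced frames (12 bins)."""
    counts = [0] * 12
    for f in f0_hz:
        if f > 0:
            counts[round(hz_to_midi(f)) % 12] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [1.0 / 12] * 12


def label_note(f0_hz, start, end, pc_probs, sigma=0.5):
    """Pick a semitone label by weighting closeness to the local median pitch
    with the global probability of the candidate's pitch class."""
    local = median([hz_to_midi(f) for f in f0_hz[start:end]])
    candidates = sorted({math.floor(local), math.ceil(local)})

    def score(c):
        closeness = math.exp(-0.5 * ((c - local) / sigma) ** 2)
        return closeness * (pc_probs[c % 12] + 1e-6)  # small floor avoids all-zero scores

    return max(candidates, key=score)


# Toy contour: silence, a note near A3, silence, then notes near C4 and D4.
contour = [0, 220, 221, 219, 0, 0, 262, 261, 263, 294, 293, 295]
probs = pitch_class_probs(contour)
for s, e in segment_notes(contour):
    print(s, e, label_note(contour, s, e, probs))
```

Using the running median rather than the previous frame as the reference keeps slow drifts and small fluctuations inside one note, which is one simple way to tolerate the ornamentation mentioned above; a real system would need considerably more careful onset modelling.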
