Exploratory Analysis of a Large Flamenco Corpus using an Ensemble of Convolutional Neural Networks as a Structural Annotation Backend

We present computational tools developed for the analysis of a large corpus of flamenco music recordings, together with the exploratory findings they produced. The computational backend is an ensemble of Convolutional Neural Networks that structurally annotates each recording with respect to the presence of vocals, guitar and hand-clapping ("palmas"). The automatically extracted annotations made it possible to visualize recordings in structurally meaningful ways, to extract global statistics on the instrumentation of flamenco music, to detect a cappella and instrumental recordings for which no such information existed, to investigate differences in structure and instrumentation across styles, and to study tonality across instrumentation and styles. The reported findings show that a large-scale analysis of flamenco music is feasible with state-of-the-art classification technology, and that the automatically extracted descriptors are both musicologically valid and useful, in the sense that they can enhance conventional metadata schemes and help bridge the semantic gap between audio recordings and high-level musicological concepts.
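As an illustration of how frame-level annotations of this kind could feed the global statistics and the detection of a cappella and instrumental recordings mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes the annotation backend yields one binary label per analysis frame for each of vocals, guitar and palmas. The function names, the dictionary layout and the 5% presence threshold are all hypothetical choices made for this example.

```python
from typing import Dict, List


def presence_ratios(frames: Dict[str, List[int]]) -> Dict[str, float]:
    """Fraction of frames in which each annotated source is marked present.

    `frames` maps a source name to per-frame 0/1 labels (assumed layout).
    """
    return {
        name: (sum(labels) / len(labels) if labels else 0.0)
        for name, labels in frames.items()
    }


def classify_recording(ratios: Dict[str, float], min_present: float = 0.05) -> str:
    """Coarse instrumentation label derived from presence ratios.

    `min_present` is a hypothetical threshold: a source counts as present
    when it is active in at least that fraction of frames.
    """
    has_vocals = ratios.get("vocals", 0.0) >= min_present
    has_guitar = ratios.get("guitar", 0.0) >= min_present
    if has_vocals and not has_guitar:
        return "a cappella"
    if has_guitar and not has_vocals:
        return "instrumental"
    if has_vocals and has_guitar:
        return "vocals with guitar"
    return "undetermined"


# Toy annotation for a four-frame recording (made-up values).
annotation = {
    "vocals": [1, 1, 1, 0],
    "guitar": [0, 0, 0, 0],
    "palmas": [0, 1, 1, 0],
}
ratios = presence_ratios(annotation)
label = classify_recording(ratios)
```

In this toy case the vocals are active in three of four frames and the guitar in none, so the recording is labelled a cappella even though palmas are present. Averaging the per-recording ratios over a corpus would give the kind of global instrumentation statistics the abstract refers to.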
