Audio Key Finding: Considerations in System Design and Case Studies on Chopin's 24 Preludes

We systematically analyze audio key finding to determine the factors important to system design and to the selection and evaluation of solutions. First, we present a basic system, the fuzzy analysis spiral array center of effect generator algorithm, with three key determination policies: nearest-neighbor (NN), relative distance (RD), and average distance (AD). AD achieved a 79% accuracy rate in an evaluation on 410 classical pieces, more than 8% higher than RD and NN. We show why audio key finding sometimes outperforms symbolic key finding. We next propose three extensions to the basic key finding system to improve performance: the modified spiral array (mSA), fundamental frequency identification (F0), and post-weight balancing (PWB). We evaluate them on Chopin's Preludes (Romantic repertoire was the most challenging). F0 provided the greatest improvement in the first 8 seconds, while mSA gave the best performance after 8 seconds. Case studies examine the cases in which all systems were correct or all were incorrect.
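To make the nearest-neighbor (NN) policy concrete, the following is a minimal sketch of a center of effect computation followed by a nearest-key lookup on a helix of pitch positions. It is an illustration under stated assumptions, not the system evaluated here: the helix parameters `R` and `H`, the weight triple `W`, the simplified minor-key definition, and the folding of the 12 pitch classes onto fifths indices 0 to 11 (which ignores pitch spelling) are all hypothetical choices. The fuzzy analysis front end and the RD and AD policies are not shown.

```python
import math

R, H = 1.0, math.sqrt(2.0 / 15.0)  # helix radius and rise per fifth (assumed values)
W = (0.6, 0.25, 0.15)              # hypothetical weights, reused for chords and keys
# Pitch classes in fifths order; spelling is ignored (assumption).
NAMES = ["C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "D#", "A#", "F"]

def pitch(k):
    """Position on the helix of the pitch k fifths above C."""
    return (R * math.sin(k * math.pi / 2), R * math.cos(k * math.pi / 2), k * H)

def mix(points, weights):
    """Weighted average of 3-D points; weights are normalized here."""
    total = sum(weights)
    return tuple(sum(w * p[i] for p, w in zip(points, weights)) / total
                 for i in range(3))

def major_chord(k):
    return mix([pitch(k), pitch(k + 1), pitch(k + 4)], W)  # root, fifth, major third

def minor_chord(k):
    return mix([pitch(k), pitch(k + 1), pitch(k - 3)], W)  # root, fifth, minor third

def major_key(k):
    # Tonic, dominant, and subdominant major triads.
    return mix([major_chord(k), major_chord(k + 1), major_chord(k - 1)], W)

def minor_key(k):
    # Simplification (assumption): minor tonic, major dominant, minor subdominant.
    return mix([minor_chord(k), major_chord(k + 1), minor_chord(k - 1)], W)

# One representative point per key, 24 keys in total.
KEYS = {}
for k, name in enumerate(NAMES):
    KEYS[(name, "major")] = major_key(k)
    KEYS[(name, "minor")] = minor_key(k)

def center_of_effect(pc_weights):
    """pc_weights maps a fifths index k to a non-negative weight."""
    ks = list(pc_weights)
    return mix([pitch(k) for k in ks], [pc_weights[k] for k in ks])

def nearest_key(point):
    """NN policy: return the key whose point is closest in Euclidean distance."""
    return min(KEYS, key=lambda name: math.dist(point, KEYS[name]))
```

A caller would build `pc_weights` from the pitch-class strengths detected in an audio window, for example `nearest_key(center_of_effect({0: 0.5, 1: 0.3, 4: 0.2}))`, and read the returned `(name, mode)` pair as the key estimate for that window.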
