Automatic Raag Classification of Pitch-tracked Performances Using Pitch-class and Pitch-class Dyad Distributions

A system was constructed to automatically identify raags from pitch-tracked performances using pitch-class distributions (PCDs) and pitch-class dyad distributions (PCDDs). Classification accuracy was 94% in a 10-fold cross-validation test with 17 target raags; using PCDs alone it was 75%, and using PCDDs alone 82%. Best performance was attained using a maximum a posteriori (MAP) rule with a multivariate normal (MVN) likelihood model. Each recording was divided into non-overlapping 30-second segments and pitch-tracked using the Harmonic Product Spectrum (HPS) algorithm. Pitch tracks were then transformed into pitch-class sequences by segmenting them into notes with a complex-domain onset detection function. For each note, the pitch class was taken as the mode of the detected pitches from the note's onset to the next onset, mapped to the nearest pitch of a just-intoned chromatic scale for the given tuning; the comparison was made in the log-frequency domain. PCDs and PCDDs were estimated from each segment, yielding 12 PCD features and 144 PCDD features, so each segment was represented by a 156-dimensional feature vector giving the relative frequencies of pitch classes and pitch-class dyads. Performance improved significantly (+15%) when principal component analysis was used to reduce the feature dimension to 50. The study suggests that PCDs and PCDDs may be effective features for raag classification, although the database must be expanded in size and diversity to confirm this more generally.
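The central feature computation, a 12-bin pitch-class distribution plus a 12×12 dyad (successive-pair) distribution per segment, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes the segment has already been reduced to a note-level sequence of pitch-class integers 0–11, and the function name is invented.

```python
import numpy as np

def pitch_class_features(pitch_classes):
    """Compute a 12-dim PCD and a 144-dim PCDD from a note-level
    pitch-class sequence (integers 0-11), concatenated into the
    156-dimensional per-segment feature vector described above."""
    pcs = np.asarray(pitch_classes)

    # Pitch-class distribution: relative frequency of each pitch class.
    pcd = np.bincount(pcs, minlength=12).astype(float)
    pcd /= pcd.sum()

    # Pitch-class dyad distribution: relative frequency of each ordered
    # pair of successive pitch classes (12 x 12 = 144 bins).
    pcdd = np.zeros((12, 12))
    for a, b in zip(pcs[:-1], pcs[1:]):
        pcdd[a, b] += 1
    pcdd /= pcdd.sum()

    # 156-dimensional feature vector for one 30-second segment.
    return np.concatenate([pcd, pcdd.ravel()])

# Toy usage with a short ascending/descending phrase:
features = pitch_class_features([0, 2, 4, 5, 4, 2, 0])
# features.shape == (156,)
```

In the study these vectors would then be reduced to 50 dimensions with PCA before MAP classification under a multivariate normal likelihood per raag; those steps are standard and omitted here.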