Application of SHAZAM-Based Audio Fingerprinting for Multilingual Indian Song Retrieval

Retrieving film songs from a multilingual database using a query clip is a challenging task. The challenge stems from subtle variations in pitch and rhythm that accompany changes in the singer’s voice, style, orchestration, language, and even the singer’s gender. The fingerprinting algorithm must therefore capture the base tune of the composition rather than its adaptations (variations that include lyrical modifications and changes in the singer’s voice). The SHAZAM system was developed to identify cover audio pieces among millions of Western songs stored in a database, with the objective of tapping into the melodic construct of a song, devoid of other embellishments. When applied to an Indian database, the system was found to be less effective because of subtle changes in both rhythm and melody, attributable mainly to the semiclassical nature of Indian film songs. The retrieval accuracy was found to be 85%. Potential reasons for the failure of the SHAZAM system are discussed with examples.
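To make the matching step concrete, the following is a minimal sketch of landmark-pair hashing with offset voting, in the spirit of SHAZAM-style fingerprinting. It is not the implementation evaluated in this work. It assumes spectrogram peaks have already been extracted as (time frame, frequency bin) pairs, and the names `pair_hashes`, `build_index`, `best_match`, `fan_out`, and `max_dt` are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def pair_hashes(peaks, fan_out=3, max_dt=50):
    """Pair each peak with up to `fan_out` later peaks (illustrative parameters).

    `peaks` is a list of (time_frame, freq_bin) tuples, assumed to be given.
    Returns a list of ((f1, f2, dt), t1) entries: a hash key plus anchor time.
    """
    peaks = sorted(peaks)
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt <= 0:
                continue  # skip peaks in the same frame as the anchor
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are farther away
            hashes.append(((f1, f2, dt), t1))
            paired += 1
            if paired == fan_out:
                break
    return hashes


def build_index(songs):
    """Map each hash key to the (song_id, anchor_time) pairs where it occurs."""
    index = defaultdict(list)
    for song_id, peaks in songs.items():
        for key, t in pair_hashes(peaks):
            index[key].append((song_id, t))
    return index


def best_match(index, query_peaks):
    """Vote on (song_id, time_offset); a true match piles up at one offset."""
    votes = Counter()
    for key, tq in pair_hashes(query_peaks):
        for song_id, ts in index.get(key, ()):
            votes[(song_id, ts - tq)] += 1
    if not votes:
        return None, 0
    (song_id, _offset), score = votes.most_common(1)[0]
    return song_id, score


# Toy data: two "songs" described only by their (made-up) peak lists.
songs = {
    "a": [(0, 10), (5, 20), (9, 15), (14, 30), (20, 12)],
    "b": [(0, 11), (4, 25), (10, 18), (15, 33), (21, 9)],
}
index = build_index(songs)

# An exact excerpt of song "a", starting 5 frames in.
excerpt = [(0, 20), (4, 15), (9, 30), (15, 12)]
# The same excerpt with every frequency bin shifted up by 2 (a transposition).
transposed = [(t, f + 2) for t, f in excerpt]

print(best_match(index, excerpt))
print(best_match(index, transposed))
```

In this toy setting the exact excerpt is matched to song "a", while the transposed excerpt produces no hash hits at all. Because the hash key contains exact frequency bins and time differences, a shift in pitch or tempo changes every key, which is consistent with the difficulty described above for renditions that vary in melody and rhythm.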
