Looking beyond sound: Unsupervised analysis of musician videos

In this work, we focus on the visual information conveyed by performing musicians. While musicians play, their movement relates to their musical performance. Analysis of this movement can therefore support structural characterization and timeline indexing of a recorded performance, especially in cases where such analyses are not trivially computed from the musical audio. We propose an unsupervised visual analysis method in which visual novelty is inferred from motion orientation histograms of regions of interest. Applying our method in a case study on audiovisually recorded jam sessions, we show that analysis of the visual channel yields promising and meaningful performance-related information, including information complementary to the audio channel.
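
To make the idea of inferring novelty from motion orientation histograms more concrete, the following is a minimal sketch and not the implementation used in this work. It assumes that per-frame motion vectors for one region of interest are already available (for example from some optical flow step, which is not shown), that histograms are compared with cosine similarity, and that novelty is scored by sliding a checkerboard kernel along the diagonal of the resulting self-similarity matrix. The function names, the bin count, and the kernel width are all hypothetical choices made for illustration.

```python
import math


def orientation_histogram(flow_vectors, n_bins=8, min_magnitude=1e-3):
    """Magnitude-weighted histogram of motion directions, normalized to sum to 1.

    flow_vectors: iterable of (dx, dy) motion vectors for one frame and one
    region of interest (assumed to be given; how they are obtained is not shown).
    """
    hist = [0.0] * n_bins
    for dx, dy in flow_vectors:
        mag = math.hypot(dx, dy)
        if mag < min_magnitude:
            continue  # ignore near-static pixels
        angle = math.atan2(dy, dx) % (2 * math.pi)
        b = min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)
        hist[b] += mag
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def cosine_similarity(a, b):
    """Cosine similarity; two all-zero histograms count as identical."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 and nb == 0:
        return 1.0
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def novelty_curve(histograms, half_width=2):
    """Score each frame by correlating a checkerboard kernel with the
    self-similarity matrix around the diagonal (assumed scoring scheme).

    High values mean the frames before and after t are each internally
    similar but differ from one another.
    """
    n = len(histograms)
    sim = [[cosine_similarity(histograms[i], histograms[j]) for j in range(n)]
           for i in range(n)]
    novelty = [0.0] * n
    for t in range(half_width, n - half_width):
        score = 0.0
        for i in range(-half_width, half_width):
            for j in range(-half_width, half_width):
                # +1 inside the past/past and future/future quadrants,
                # -1 in the two cross quadrants.
                sign = 1.0 if (i < 0) == (j < 0) else -1.0
                score += sign * sim[t + i][t + j]
        novelty[t] = score
    return novelty


if __name__ == "__main__":
    # Toy input: four frames of rightward motion, then four of upward motion.
    frames = [[(1.0, 0.0)] * 5] * 4 + [[(0.0, 1.0)] * 5] * 4
    hists = [orientation_histogram(f) for f in frames]
    print(novelty_curve(hists, half_width=2))
```

In this toy input the dominant motion direction changes halfway through, so the curve should peak at the frame where the change occurs. On real footage, the region of interest, the frame rate, and the kernel width would all need tuning, and peaks would only be candidate points of visual change rather than confirmed structural boundaries.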
