Video-Based Vibrato Detection and Analysis for Polyphonic String Music

In music performance, vibrato is an important artistic effect in which slight variations in pitch are introduced to add expressiveness and warmth. Automatic vibrato detection and analysis, although well studied for monophonic music, has rarely been explored for polyphonic music because of the difficulty of multi-pitch analysis. We propose a video-based approach for detecting and analyzing vibrato in polyphonic string music. Specifically, we capture the fine motion of string players' left hands through optical flow analysis of video frames. We explore two methods: the first uses a feature extraction and SVM classification pipeline, and the second is an unsupervised technique based on autocorrelation analysis of the principal motion component. The proposed methods are compared with audio-only methods applied to individual instrument tracks separated from the original audio mixture using the score. Experiments show that the proposed video-based methods achieve significantly higher vibrato detection accuracy than the audio-based methods, especially in high-polyphony cases. Further experiments also demonstrate the utility of the approach for vibrato rate and extent analysis.
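The unsupervised method can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the optical flow has already been reduced to one average 2-D motion vector per video frame over the left-hand region, and the function names, the 4 to 8 Hz search range, and the 0.5 autocorrelation threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def principal_motion(flow_xy):
    """Project per-frame 2-D motion vectors (shape T x 2) onto their first principal axis."""
    centered = flow_xy - flow_xy.mean(axis=0)
    # The rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]

def autocorr_vibrato(signal, fps, rate_range=(4.0, 8.0), threshold=0.5):
    """Return (is_vibrato, rate_hz) from the normalized autocorrelation of a 1-D motion signal.

    rate_range and threshold are illustrative assumptions, not values from the paper.
    """
    x = signal - signal.mean()
    energy = float(np.dot(x, x))
    if energy == 0.0:
        return False, 0.0  # no motion at all
    # Keep non-negative lags only and normalize so that ac[0] == 1.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:] / energy
    min_lag = int(np.ceil(fps / rate_range[1]))
    max_lag = min(int(np.floor(fps / rate_range[0])), len(x) - 1)
    if max_lag < min_lag:
        return False, 0.0  # segment too short for the search range
    lags = np.arange(min_lag, max_lag + 1)
    best = int(lags[np.argmax(ac[lags])])
    if ac[best] < threshold:
        return False, 0.0  # no clear periodicity in the vibrato range
    return True, fps / best

# Usage: a synthetic 6 Hz hand oscillation along a fixed direction, filmed at 30 fps.
fps = 30
t = np.arange(60) / fps
direction = np.array([0.8, 0.6])
flow = np.sin(2 * np.pi * 6.0 * t)[:, None] * direction
is_vibrato, rate = autocorr_vibrato(principal_motion(flow), fps)
```

Because the lag is an integer number of frames, the rate estimate is quantized (at 30 fps only 4.29, 5, 6, and 7.5 Hz are reachable in this range). A practical system would likely interpolate around the autocorrelation peak or use a higher frame rate.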
