Vision-Based Hand Gesture Recognition for Understanding Musical Time Pattern and Tempo

We introduce a method for recognizing four musical time patterns and three tempos generated by a human conductor of a robot orchestra, or by an operator of a computer-based music playing system, using hand gesture recognition. The method uses only a stereo vision camera and requires no additional special devices. We propose a simple and reliable vision-based hand gesture recognition approach built on two simple features. One is the motion-direction code, a quantized code for motion directions. The other is the conducting feature point (CFP), a point at which the motion changes suddenly. The proposed hand gesture recognition system operates as follows. First, it extracts the human hand region by segmenting the depth information generated by stereo matching of image sequences. Next, it tracks the motion of the center of gravity (COG) of the extracted hand region and generates gesture features such as the CFP and the motion-direction code. Finally, it obtains the current time pattern and tempo of the music being played from the recognized hand gesture, using either CFP tracking or motion histogram matching. Experimental results on the test data set show that the recognition rate for musical time pattern and tempo is over 86.42% for motion histogram matching and 79.75% for CFP tracking.
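To make the two features concrete, the following Python sketch computes motion-direction codes from a sequence of COG positions, marks candidate CFPs where the code changes abruptly, and compares a normalized motion histogram against templates. It is an illustration under stated assumptions, not the implementation evaluated here: the 8-direction quantization, the code-jump threshold `min_code_jump`, the use of histogram intersection as the similarity measure, and all function and template names are hypothetical choices that the abstract does not specify.

```python
import math
from collections import Counter

NUM_DIRECTIONS = 8  # assumed quantization level; not specified in the abstract


def direction_code(p, q, num_directions=NUM_DIRECTIONS):
    """Quantize the motion from point p to point q into a direction code."""
    angle = math.atan2(q[1] - p[1], q[0] - p[0]) % (2 * math.pi)
    sector = 2 * math.pi / num_directions
    # Shift by half a sector so that code 0 is centered on the +x axis.
    return int((angle + sector / 2) // sector) % num_directions


def direction_codes(trajectory):
    """Direction code of every consecutive pair of COG positions.

    A practical system would likely skip near-zero displacements first;
    this sketch does not.
    """
    return [direction_code(p, q) for p, q in zip(trajectory, trajectory[1:])]


def conducting_feature_points(trajectory, min_code_jump=2):
    """Indices of trajectory points where the direction code jumps abruptly.

    min_code_jump is a hypothetical threshold on the circular code difference.
    """
    codes = direction_codes(trajectory)
    cfps = []
    for i in range(1, len(codes)):
        diff = abs(codes[i] - codes[i - 1])
        diff = min(diff, NUM_DIRECTIONS - diff)  # codes wrap around the circle
        if diff >= min_code_jump:
            cfps.append(i)  # the turn happens at trajectory[i]
    return cfps


def motion_histogram(codes):
    """Normalized histogram of direction codes."""
    counts = Counter(codes)
    total = len(codes)
    return [counts[c] / total if total else 0.0 for c in range(NUM_DIRECTIONS)]


def histogram_intersection(h1, h2):
    """Similarity of two normalized histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def classify(hist, templates):
    """Return the name of the template most similar to hist."""
    return max(templates, key=lambda name: histogram_intersection(hist, templates[name]))


# Toy COG trajectory: two steps along +y, two along +x, one along -y.
trajectory = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1)]
codes = direction_codes(trajectory)
cfps = conducting_feature_points(trajectory)
hist = motion_histogram(codes)

# Hypothetical templates, one normalized histogram per gesture class.
templates = {
    "pattern_a": [0.4, 0.0, 0.4, 0.0, 0.0, 0.0, 0.2, 0.0],
    "pattern_b": [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
}
label = classify(hist, templates)
```

In image coordinates the y axis usually points downward, so the mapping from codes to visual directions depends on the coordinate convention. The timing between successive CFPs is one plausible source of tempo information, although the abstract does not state how tempo is derived.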