Deep Learning-Based Violin Bowing Action Recognition

We propose a violin bowing action recognition system that can accurately recognize distinct bowing actions in classical violin performance. The system recognizes bowing actions by analyzing signals from a depth camera and from inertial sensors worn by a violinist. The contribution of this study is threefold: (1) a dataset of violin bowing actions was constructed from data captured by a depth camera and multiple inertial sensors; (2) data augmentation was performed on the depth-frame data through rotations in three-dimensional world coordinates and on the inertial sensing data through yaw, pitch, and roll angle transformations; and (3) bowing action classifiers were trained on the different modalities using deep learning methods and combined through a decision-level fusion process, so that the strengths of one modality compensate for the weaknesses of another. In experiments, both the large external motions and the subtle local motions produced by violin bow manipulation were accurately recognized by the proposed system (average accuracy > 80%).
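The two techniques named in the contributions, rotation-based augmentation of inertial data and decision-level fusion of per-modality classifiers, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the ZYX Euler-angle convention, the `(T, 3)` sample layout, and the probability-averaging fusion rule are all assumptions for the sake of the example.

```python
import numpy as np

def rotation_matrix(yaw, pitch, roll):
    """Build a 3-D rotation matrix from yaw, pitch, roll angles (radians).

    Assumes a ZYX Euler convention; the paper does not specify one.
    """
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    return Rz @ Ry @ Rx

def augment_inertial(samples, yaw, pitch, roll):
    """Rotate a (T, 3) sequence of accelerometer/gyroscope readings.

    Simulates the same bowing action performed with the sensor worn at a
    slightly different orientation, yielding an augmented training sample.
    """
    R = rotation_matrix(yaw, pitch, roll)
    return samples @ R.T

def fuse_decisions(prob_depth, prob_inertial):
    """Decision-level fusion by averaging per-class probabilities.

    Averaging is one simple fusion rule; weighted or learned rules are
    equally possible. Returns the index of the fused winning class.
    """
    fused = (np.asarray(prob_depth) + np.asarray(prob_inertial)) / 2.0
    return int(np.argmax(fused))
```

Because the augmentation is a pure rotation, it preserves the magnitude of each reading, so the augmented sequence remains physically plausible while its axis decomposition changes.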
