Rhythm-aware Sequence-to-sequence Learning for Labanotation Generation with Gesture-sensitive Graph Convolutional Encoding

Labanotation is a professional dance notation system widely used in dance education and choreography preservation. Automatically generating Labanotation dance scores from motion capture data can save considerable manual time and effort. The sequence-to-sequence (seq2seq) model has recently been applied to automatic Labanotation generation. This model is based on an encoder-decoder structure, which encodes the input motion sequence into a fixed-length vector and then decodes that vector to generate the target sequence. However, existing work does not encode the spatial skeleton structure of the motion data. Moreover, aligning the input motion data with the output Laban symbol sequences is challenging because the lengths of the two sequences are severely imbalanced. In this paper, we therefore present a new seq2seq model for more effective Labanotation generation. In the encoder, we propose a new gesture-sensitive graph convolutional network with learned adaptive joint weights and non-physical connections to learn both spatial and temporal patterns from motion data sequences. In the decoder, we exploit motion rhythm information and propose a novel rhythm-aware attention mechanism to learn the alignment between motion sequences and Laban symbol sequences, so that the model attends only to the relevant part of the input motion sequence, rather than searching the whole sequence, when predicting a target Laban symbol. Extensive experiments on two real-world datasets show that the proposed method outperforms state-of-the-art approaches on the task of automatic Labanotation generation.
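
To illustrate the decoder-side idea of attending only to a relevant part of the motion sequence, the following is a minimal sketch of dot-product attention restricted to a window of frames. It is not the proposed rhythm-aware mechanism itself: how the window is derived from motion rhythm is not specified in this abstract, so the function name `windowed_attention` and the inputs `center` and `half_width` are hypothetical placeholders for whatever the rhythm information would supply.

```python
import math


def windowed_attention(query, frames, center, half_width):
    """Softmax dot-product attention over frames near `center` only.

    `center` and `half_width` are hypothetical stand-ins for a window
    that a rhythm cue would select; frames outside it get weight 0.
    """
    lo = max(0, center - half_width)
    hi = min(len(frames), center + half_width + 1)
    if lo >= hi:
        raise ValueError("window does not overlap the frame sequence")

    # Score only the frames inside the window.
    scores = [sum(q * f for q, f in zip(query, frames[t])) for t in range(lo, hi)]

    # Numerically stable softmax over the windowed scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)

    weights = [0.0] * len(frames)
    for offset, e in enumerate(exps):
        weights[lo + offset] = e / total

    # Context vector: weighted sum of the windowed frame features.
    dim = len(query)
    context = [sum(weights[t] * frames[t][d] for t in range(lo, hi)) for d in range(dim)]
    return weights, context


# Toy example: six 2-D frame encodings, window covering frames 1..3.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0], [2.0, 2.0]]
weights, context = windowed_attention([1.0, 0.0], frames, center=2, half_width=1)
```

In this toy example, frames 0, 4, and 5 receive zero weight regardless of their content, so the cost of each decoding step depends on the window size rather than on the full length of the motion sequence.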