Wei Wu | Huang Hu | Mi Zhang | Ruozi Huang | Kei Sawada
[1] Otmar Hilliges,et al. Learning Human Motion Models for Long-Term Predictions , 2017, 2017 International Conference on 3D Vision (3DV).
[2] Samy Bengio,et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.
[3] Colin Raffel,et al. librosa: Audio and Music Signal Analysis in Python , 2015, SciPy.
[4] Geoffrey E. Hinton,et al. Modeling Human Motion Using Binary Latent Variables , 2006, NIPS.
[5] Alexander M. Rush,et al. Sequence-to-Sequence Learning as Beam-Search Optimization , 2016, EMNLP.
[6] R. Zatorre,et al. Listening to musical rhythms recruits motor regions of the brain. , 2008, Cerebral cortex.
[7] R. Campbell,et al. Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex , 2000, Current Biology.
[8] Jan Kautz,et al. MoCoGAN: Decomposing Motion and Content for Video Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[9] Jung-Woo Ha,et al. Dual Attention Networks for Multimodal Reasoning and Matching , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[10] Ira Kemelmacher-Shlizerman,et al. Audio to Body Dynamics , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[11] Lukasz Kaiser,et al. Attention Is All You Need , 2017, NIPS.
[12] Douglas Eck,et al. Music Transformer , 2018, arXiv:1809.04281.
[13] P. Cochat,et al. Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.
[14] David J. Fleet,et al. Gaussian Process Dynamical Models , 2005, NIPS.
[15] Xi Chen,et al. Stacked Cross Attention for Image-Text Matching , 2018, ECCV.
[16] Dahua Lin,et al. Convolutional Sequence Generation for Skeleton-Based Action Synthesis , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[17] Eduardo de Campos Valadares,et al. Dancing to the music , 2000.
[18] Dimitris N. Metaxas,et al. StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[19] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[20] Pascal Vincent,et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion , 2010, J. Mach. Learn. Res.
[21] Yaser Sheikh,et al. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[22] Janet Adshead-Lansdale,et al. Dance History: An Introduction , 1994.
[23] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[24] Shinji Watanabe,et al. Weakly-Supervised Deep Recurrent Neural Networks for Basic Dance Step Generation , 2018, 2019 International Joint Conference on Neural Networks (IJCNN).
[25] Martial Hebert,et al. The Pose Knows: Video Forecasting by Generating Pose Futures , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[26] Yang Feng,et al. Bridging the Gap between Training and Inference for Neural Machine Translation , 2019, ACL.
[27] Marc'Aurelio Ranzato,et al. Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.
[28] Zhen Zhang,et al. Convolutional Sequence to Sequence Model for Human Dynamics , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[29] Scott Cohen,et al. Forecasting Human Dynamics from Static Images , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[30] Richard Socher,et al. Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[32] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[33] Bernt Schiele,et al. Generative Adversarial Text to Image Synthesis , 2016, ICML.
[34] Ilya Sutskever,et al. Generating Long Sequences with Sparse Transformers , 2019, ArXiv.
[35] Michael J. Black,et al. On Human Motion Prediction Using Recurrent Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] G. Widmer,et al. Maximum Filter Vibrato Suppression for Onset Detection , 2013.
[37] L E Marks,et al. On the cross-modal perception of intensity. , 1986, Journal of experimental psychology. Human perception and performance.
[38] Navdeep Jaitly,et al. Natural TTS Synthesis by Conditioning Wavenet on MEL Spectrogram Predictions , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Jitendra Malik,et al. Recurrent Network Models for Human Dynamics , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[40] Alexei A. Efros,et al. Everybody Dance Now , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[41] Sepp Hochreiter,et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.
[42] Minho Lee,et al. Music similarity-based approach to generating dance motion sequence , 2012, Multimedia Tools and Applications.
[43] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[44] Yi Zhou,et al. Auto-Conditioned Recurrent Networks for Extended Complex Human Motion Synthesis , 2017, ICLR.
[45] Yann Dauphin,et al. Convolutional Sequence to Sequence Learning , 2017, ICML.
[46] Yoshua Bengio,et al. Attention-Based Models for Speech Recognition , 2015, NIPS.
[47] Daniel P. W. Ellis,et al. Beat Tracking by Dynamic Programming , 2007.
[48] Weidong Geng,et al. Example-Based Automatic Music-Driven Conventional Dance Motion Synthesis , 2012, IEEE Transactions on Visualization and Computer Graphics.
[49] Zhe Gan,et al. AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[50] Jamie Ward,et al. Sound-Colour Synaesthesia: to What Extent Does it Use Cross-Modal Mechanisms Common to us All? , 2006, Cortex.
[51] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[52] Yoshua Bengio,et al. Professor Forcing: A New Algorithm for Training Recurrent Networks , 2016, NIPS.
[53] Sebastian Nowozin,et al. Efficient Nonlinear Markov Models for Human Motion , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[54] Jamie Ward,et al. Crossmodal interactions: lessons from synesthesia. , 2006, Progress in brain research.
[55] Jan Kautz,et al. Video-to-Video Synthesis , 2018, NeurIPS.
[56] Silvio Savarese,et al. Structural-RNN: Deep Learning on Spatio-Temporal Graphs , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).