Weakly Supervised Learning with Multi-Stream CNN-LSTM-HMMs to Discover Sequential Parallelism in Sign Language Videos
Hermann Ney | Oscar Koller | Necati Cihan Camgoz | Richard Bowden
[1] Samy Bengio,et al. An Asynchronous Hidden Markov Model for Audio-Visual Speech Recognition , 2002, NIPS.
[2] Maja Pantic,et al. End-to-End Multi-View Lipreading , 2017, BMVC.
[3] Mubarak Shah,et al. Discovering Motion Primitives for Unsupervised Grouping and One-Shot Learning of Human Actions, Gestures, and Expressions , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[4] Nicolas Pugeault,et al. Reading the signs: A video based sign dictionary , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).
[5] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[6] Luc Van Gool,et al. Efficient Mining of Frequent and Distinctive Feature Configurations , 2007, 2007 IEEE 11th International Conference on Computer Vision.
[7] Joon Son Chung,et al. Lip Reading Sentences in the Wild , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Hermann Ney,et al. Re-Sign: Re-Aligned End-to-End Sequence Modelling with Deep Recurrent CNN-HMMs , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Xiaojin Zhu,et al. Semi-Supervised Learning Literature Survey , 2005 .
[10] Charles Markham,et al. Weakly Supervised Training of a Sign Language Recognition System Using Multiple Instance Learning Density Matrices , 2011, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[11] Hermann Ney,et al. RASR - The RWTH Aachen University Open Source Speech Recognition Toolkit , 2011 .
[12] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[13] Hermann Ney,et al. A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.
[14] Martin J. Russell,et al. Integrating audio and visual information to provide highly robust speech recognition , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[15] Oscar Koller,et al. SubUNets: End-to-End Hand Shape and Continuous Sign Language Recognition , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[16] Hermann Ney,et al. Deep Sign: Hybrid CNN-HMM for Continuous Sign Language Recognition , 2016, BMVC.
[17] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[18] Hermann Ney,et al. Read My Lips: Continuous Signer Independent Weakly Supervised Viseme Recognition , 2014, ECCV.
[19] Hermann Ney,et al. Joint-sequence models for grapheme-to-phoneme conversion , 2008, Speech Commun..
[20] Roger K. Moore,et al. Hidden Markov model decomposition of speech and noise , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[21] Dimitris N. Metaxas,et al. Handshapes and Movements: Multiple-Channel American Sign Language Recognition , 2003, Gesture Workshop.
[22] Ali Farhadi,et al. Aligning ASL for Statistical Translation Using a Discriminative Word Model , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[23] Alex Pentland,et al. Coupled hidden Markov models for complex action recognition , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[24] Stephen Cox,et al. Some statistical issues in the comparison of speech recognition algorithms , 1989, International Conference on Acoustics, Speech, and Signal Processing.
[25] Changshui Zhang,et al. Recurrent Convolutional Neural Networks for Continuous Sign Language Recognition by Staged Optimization , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Sudeep Sarkar,et al. Automated extraction of signs from continuous sign language sentences using Iterated Conditional Modes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[27] Rachel McKee,et al. The Online Dictionary of New Zealand Sign Language , 2017 .
[28] Dimitris N. Metaxas,et al. Parallel hidden Markov models for American sign language recognition , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.
[29] L. Baum,et al. Statistical Inference for Probabilistic Functions of Finite State Markov Chains , 1966 .
[30] Hermann Ney,et al. Deep Learning of Mouth Shapes for Sign Language , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).
[31] Juergen Luettin,et al. Audio-Visual Speech Modeling for Continuous Speech Recognition , 2000, IEEE Trans. Multim..
[32] Dimitris N. Metaxas,et al. Handshapes and movements: Multiple-channel ASL recognition , 2004 .
[33] Matthew Brand,et al. Coupled hidden Markov models for modeling interacting processes , 1997 .
[34] Georg Heigold,et al. GMM-Free DNN Training , 2014 .
[35] Hermann Ney,et al. Modality Combination Techniques for Continuous Sign Language Recognition , 2013, IbPRIA.
[36] Hideki Nakayama,et al. Multimodal Gesture Recognition Using Multi-stream Recurrent Neural Network , 2015, PSIVT.
[37] Hervé Bourlard,et al. Multi-Stream Speech Recognition , 1996 .
[38] Karl-Friedrich Kraiss,et al. Recent developments in visual sign language recognition , 2008, Universal Access in the Information Society.
[39] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[40] Johan A. du Preez,et al. Audio-Visual Speech Recognition using SciPy , 2010 .
[41] Hermann Ney,et al. Improving Continuous Sign Language Recognition: Speech Recognition Techniques and System Design , 2013, SLPAT.
[42] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[43] Wen Gao,et al. A Parallel Multistream Model for Integration of Sign Language Recognition and Lip Motion , 2000, ICMI.
[44] Hervé Bourlard,et al. Using multiple time scales in a multi-stream speech recognition system , 1997, EUROSPEECH.
[45] Andrew Zisserman,et al. Learning sign language by watching TV (using weakly aligned subtitles) , 2009, CVPR.
[46] Chong Wang,et al. Simultaneous image classification and annotation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[47] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Ying Wu,et al. Self-supervised learning for object recognition based on kernel discriminant-EM algorithm , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.
[49] Hermann Ney,et al. Deep Sign: Enabling Robust Statistical Continuous Sign Language Recognition via Hybrid CNN-HMMs , 2018, International Journal of Computer Vision.
[50] Jussi Kangasharju,et al. The use of meta-HMM in multistream HMM training for automatic speech recognition , 1998, ICSLP.
[51] Mohamed Jemni,et al. Towards a 3D Signing Avatar from SignWriting Notation , 2012, ICCHP.
[52] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[53] Hermann Ney,et al. Extensions of the Sign Language Recognition and Translation Corpus RWTH-PHOENIX-Weather , 2014, LREC.
[54] Hermann Ney,et al. Neural Sign Language Translation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[55] A. Nakamura,et al. Nature (London) , 1975 .
[56] Hermann Ney,et al. Deep Hand: How to Train a CNN on 1 Million Hand Images When Your Data is Continuous and Weakly Labelled , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[57] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[58] Hermann Ney,et al. RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus , 2012, LREC.
[59] Michael I. Jordan,et al. Factorial Hidden Markov Models , 1995, Machine Learning.
[60] Jürgen Schmidhuber,et al. Framewise phoneme classification with bidirectional LSTM and other neural network architectures , 2005, Neural Networks.
[61] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[62] Marcus Liwicki,et al. A novel approach to on-line handwriting recognition based on bidirectional long short-term memory networks , 2007 .
[63] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[64] Harriet J. Nock,et al. Modelling asynchrony in automatic speech recognition using loosely coupled hidden Markov models , 2002, Cogn. Sci..
[65] Ronald J. Williams,et al. Gradient-based learning algorithms for recurrent networks and their computational complexity , 1995 .
[66] Hervé Bourlard,et al. A new ASR approach based on independent processing and recombination of partial frequency bands , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[67] Yoshua Bengio,et al. End-to-end attention-based large vocabulary speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[68] Petros Maragos,et al. Product-HMMs for automatic sign language recognition , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[69] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[70] Hermann Ney,et al. May the force be with you: Force-aligned signwriting for automatic subunit annotation of corpora , 2013, 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).
[71] Jonathan G. Fiscus,et al. Tools for the analysis of benchmark speech recognition tests , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[72] Helen Cooper,et al. Learning signs from subtitles: A weakly supervised approach to sign language recognition , 2009, CVPR.
[73] Chalapathy Neti,et al. Recent advances in the automatic recognition of audiovisual speech , 2003, Proc. IEEE.
[74] Hermann Ney,et al. Enhancing gloss-based corpora with facial features using active appearance models , 2013 .
[75] Hermann Ney,et al. Continuous sign language recognition: Towards large vocabulary statistical recognition systems handling multiple signers , 2015, Comput. Vis. Image Underst..