Probabilistic sequence models for image sequence processing and recognition

This PhD thesis investigates three image sequence labeling problems: optical character recognition (OCR), object tracking, and automatic sign language recognition (ASLR). We examine which concepts and ideas from speech recognition can be transferred to these problems. For each task we propose an approach that builds on methods known from speech recognition and is adapted to the problem at hand. In particular, we describe our hidden Markov model (HMM) based image sequence recognition system, which has been adopted from a large vocabulary continuous speech recognition (LVCSR) framework and extended for these tasks. For OCR, we present our RWTH Aachen University Optical Character Recognition (RWTH OCR) system, which has been developed within the scope of this thesis work. We analyze simple appearance-based features in combination with complex training algorithms, and discuss in detail discriminative features, discriminative training, and a novel discriminative confidence-based unsupervised adaptation approach. For ASLR, we adapt the RWTH Aachen University Speech Recognition (RWTH ASR) framework to account for multiple modalities important in sign language communication, e.g. hand configuration, place of articulation, hand movement, and hand orientation. Additionally, non-manual components such as facial expression and body posture are analyzed. Most features relevant to sign language require a robust tracking method. We propose a multi-purpose, model-free object tracking framework that is based on dynamic programming (DP) and is applied to hand and head tracking tasks in ASLR. In particular, a context-dependent optimization of tracking decisions over time allows occluded objects to be tracked robustly. The algorithm is inspired by the time alignment algorithm in speech recognition, which is guaranteed to find the optimal path with respect to a given criterion and avoids potentially wrong local decisions. All results in this work are evaluated either on standard benchmark databases or on novel publicly available databases generated within the scope of this thesis work. Our OCR system is evaluated on various handwritten benchmark databases and for multiple languages. Additionally, a novel Arabic machine-printed newspaper database is presented and used for evaluation. Our dynamic programming tracking (DPT) framework and its different algorithms are evaluated for head and hand tracking in sign languages on more than 120,000 frames of annotated ground-truth data. The ASLR system is evaluated for multiple sign languages, such as American Sign Language (ASL), Deutsche Gebärdensprache (DGS), and Nederlandse Gebaren Taal (NGT), on databases of different visual complexity. In all cases highly competitive results are achieved, in part outperforming all other approaches known from the literature.
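The idea of deferring tracking decisions until a globally optimal path is known can be illustrated with a small sketch. This is not the DPT framework of the thesis; it is a generic Viterbi-style dynamic program under stated assumptions: each frame supplies a non-empty list of candidate positions with a local score (higher is better), and transitions are penalized by squared displacement. The names `track_dp`, `candidates`, `scores`, and `jump_penalty` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple

Position = Tuple[int, int]


def track_dp(
    candidates: Sequence[Sequence[Position]],
    scores: Sequence[Sequence[float]],
    jump_penalty: float = 1.0,
) -> List[Position]:
    """Return the highest-scoring path with one candidate position per frame.

    Assumption (not from the thesis): the path score is the sum of local
    scores minus jump_penalty times the squared distance between
    consecutive positions. Every frame must have at least one candidate.
    """
    if not candidates:
        return []

    # best[t][i]: best accumulated score of any path ending at candidate i in frame t
    best = [list(scores[0])]
    # back[t][i]: index of the predecessor candidate in frame t-1 on that path
    back = [[-1] * len(candidates[0])]

    for t in range(1, len(candidates)):
        row_best, row_back = [], []
        for i, (x, y) in enumerate(candidates[t]):
            best_val, best_prev = float("-inf"), -1
            for j, (px, py) in enumerate(candidates[t - 1]):
                dist2 = (x - px) ** 2 + (y - py) ** 2
                val = best[t - 1][j] - jump_penalty * dist2
                if val > best_val:
                    best_val, best_prev = val, j
            row_best.append(best_val + scores[t][i])
            row_back.append(best_prev)
        best.append(row_best)
        back.append(row_back)

    # Traceback: the decision for every frame is taken only now,
    # after all frames have been scored.
    i = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for t in range(len(candidates) - 1, -1, -1):
        path.append(candidates[t][i])
        i = back[t][i]
    path.reverse()
    return path


# Frame 1 mimics an occlusion: the true object at (0, 0) scores lower
# than a distractor at (10, 10), so a greedy per-frame choice would jump.
frames = [[(0, 0), (10, 10)]] * 3
local_scores = [[1.0, 0.0], [0.2, 0.5], [1.0, 0.0]]
path = track_dp(frames, local_scores, jump_penalty=0.01)
```

In this toy example a frame-by-frame maximum would select the distractor in the middle frame, whereas the traceback keeps the path on the object because the surrounding frames and the transition penalty outweigh the single misleading local score.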
