Learning Deep Neural Networks for High Dimensional Output Problems

State-of-the-art pattern recognition methods have difficulty dealing with problems where the dimension of the output space is large. In this article, we propose a framework based on deep architectures (e.g. deep neural networks) to address this issue. Deep architectures have proven efficient for high-dimensional input problems such as image classification, owing to their ability to embed the input space. The main contribution of this article is the extension of the embedding procedure to both the input and output spaces, so that complex outputs can be handled easily. With this extension, inter-output dependencies can be modelled efficiently, providing an interesting alternative to probabilistic models such as HMMs and CRFs. Preliminary experiments on toy datasets and on USPS character reconstruction show promising results.
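The joint input/output embedding idea sketched in the abstract can be illustrated schematically. The snippet below is a minimal, hypothetical sketch (not the authors' actual architecture): an encoder embeds the input, an autoencoder-style decoder maps a shared low-dimensional embedding back to the high-dimensional output space, and all layer sizes and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

def forward(x, W, b):
    """Dense layer with tanh non-linearity."""
    return np.tanh(x @ W + b)

# Illustrative dimensions: 64-d input, 256-d output, 16-d shared embedding.
d_in, d_out, d_emb = 64, 256, 16

W_enc, b_enc = init_layer(d_in, d_emb)   # input encoder: embed the input space
W_dec, b_dec = init_layer(d_emb, d_out)  # output decoder: map embedding to the
                                         # high-dimensional output space

x = rng.normal(size=(8, d_in))           # batch of 8 toy inputs
z = forward(x, W_enc, b_enc)             # shared low-dimensional embedding
y_hat = forward(z, W_dec, b_dec)         # predicted high-dimensional output

print(z.shape, y_hat.shape)
```

Because every output coordinate is produced from the same shared embedding `z`, dependencies between output dimensions are captured implicitly, which is the property the abstract contrasts with explicit probabilistic models such as HMMs and CRFs.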
