Stacked multichannel autoencoder – an efficient way of learning from synthetic data

Learning from synthetic data has many important applications in cases where sufficient amounts of labeled data are not available. Using synthetic data is challenging because of differences in feature distributions between synthetic and real data, a phenomenon we term the synthetic gap. In this paper, we investigate and formalize a general framework, the Stacked Multichannel Autoencoder (SMCAE), that bridges the synthetic gap and enables learning from synthetic data more efficiently. In particular, we show that SMCAE can not only transform and use synthetic data on a challenging face-sketch recognition task, but can also help simulate real images that can be used to train classifiers for recognition. Preliminary experiments validate the effectiveness of the proposed framework.
