Deep Neural Networks with Parallel Autoencoders for Learning Pairwise Relations: Handwritten Digits Subtraction

Modelling relational data is a common task in machine learning. In this work, we focus on learning pairwise relations between two entities with deep neural networks. To exploit the structure of inputs formed by concatenating two entities, two separate stacked autoencoders are introduced in parallel to extract features from each entity individually; these features are then fed into a deep neural network for classification. The method is applied to a specific problem: deciding whether two input handwritten digits differ by one. We evaluated performance on a handwritten digit subtraction (HDS) dataset generated from MNIST, a widely used benchmark for machine learning and pattern recognition methods, and compared against several other algorithms, including logistic regression and support vector machines. Deep neural networks outperformed the other methods; in particular, a deep neural network fitted with two separate autoencoders in parallel improved prediction accuracy to 88.27%, from 85.83% for a standard neural network with a single stacked autoencoder.
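The parallel-autoencoder architecture described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the hidden size of 100, the single encoding layer (in place of a full pretrained stack), and the untrained random weights are all assumptions made for brevity; in the actual method each encoder would be pretrained layer-wise as a stacked autoencoder before the joint network is trained for classification.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def make_encoder(n_in, n_hidden):
    """Random encoder weights; in the paper these would come from
    layer-wise pretraining of a stacked autoencoder."""
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b = np.zeros(n_hidden)
    return lambda x: sigmoid(x @ W + b)

# Two SEPARATE encoders in parallel, one per input digit
# (28x28 = 784 pixels each), rather than one encoder over
# the 1568-pixel concatenated image.
enc_left = make_encoder(784, 100)   # hypothetical hidden size
enc_right = make_encoder(784, 100)

# Classifier head over the concatenated codes: binary output
# ("do the two digits differ by one?").
W_clf = rng.normal(0.0, 0.1, (200, 1))
b_clf = np.zeros(1)

def predict(digit_a, digit_b):
    # Extract individual features in parallel, then classify jointly.
    h = np.concatenate([enc_left(digit_a), enc_right(digit_b)])
    return sigmoid(h @ W_clf + b_clf)

# Dummy pixel vectors standing in for real MNIST digits.
p = predict(rng.random(784), rng.random(784))
```

The key design choice is that each digit gets its own feature extractor, so the learned codes reflect per-entity structure before the network reasons about the pairwise relation.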
