Assembly output codes for learning neural networks

Neural network-based classifiers usually encode the class labels of input data with a completely disjoint code, i.e. a one-hot binary vector in which a single bit is associated with each category. Drawing on coding theory, we propose assembly codes, in which each output element is associated with several classes, as richer target vectors. These codes emulate the combination of several classifiers, a well-known way to improve decision accuracy. Our experiments with a multi-layer neural network on datasets such as MNIST show that assembly output codes, which are characterized by a larger minimum Hamming distance, lead to better classification performance. These codes are also well suited to representing categories with clustered clique-based networks.
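
To make the idea concrete, here is a minimal sketch in Python/NumPy of how assembly output codes could replace one-hot targets. The random sparse construction, the code sizes, and the Hamming-distance decoder are illustrative assumptions for this sketch, not the exact construction used in the paper.

    # Sketch (assumptions, not the paper's exact scheme): each class gets a sparse
    # binary codeword in which every output unit is shared by several classes;
    # prediction decodes the network output to the nearest codeword in Hamming
    # distance after thresholding at 0.5.
    import numpy as np

    def make_assembly_codes(n_classes, code_length, weight, seed=0):
        """Assign each class a binary codeword with `weight` active bits."""
        rng = np.random.default_rng(seed)
        codes = np.zeros((n_classes, code_length), dtype=np.float32)
        for c in range(n_classes):
            active = rng.choice(code_length, size=weight, replace=False)
            codes[c, active] = 1.0
        return codes

    def decode(outputs, codes):
        """Map real-valued outputs to the class with the nearest codeword."""
        bits = (outputs >= 0.5).astype(np.float32)
        distances = np.abs(bits[:, None, :] - codes[None, :, :]).sum(axis=2)
        return distances.argmin(axis=1)

    # Example: 10 MNIST classes, 30 output units, 8 active bits per class.
    codes = make_assembly_codes(n_classes=10, code_length=30, weight=8)
    targets = codes[np.array([3, 7, 1])]          # training targets for labels 3, 7, 1
    noisy_outputs = targets + 0.1 * np.random.randn(*targets.shape)
    print(decode(noisy_outputs, codes))           # -> [3 7 1] with high probability

Training would then proceed as usual, with the network's outputs regressed onto these codewords instead of one-hot vectors, and test-time classification performed by the nearest-codeword decoder above.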
