Coarse and fine learning in deep networks

Evolutionary systems such as Learning Classifier Systems (LCS) learn reliably in irregular domains, while Artificial Neural Networks (ANNs) are very successful on problems with an appropriate gradient. This study introduces a novel method for discovering coarse structure, using a technique related to LCS, in combination with gradient descent. The structure used is a deep feature network with several properties at a higher level of abstraction than existing ANNs: the network is constructed from co-occurrence relationships and maintained as a dynamic population of features. The feature creation technique can be viewed as a coarse or rapid initialization step that constructs a network before subsequent fine-tuning with gradient descent. The process is comparable with, but distinct from, layer-wise pretraining methods that construct and initialize a deep network prior to fine-tuning. The approach we introduce is a general learning technique with assumptions about the dimensionality of the input, and the described method uses convolved features. Classification of MNIST images yields an average error rate of 0.79% without pre-processing or pretraining, comparable to the benchmark results for Restricted Boltzmann Machines of 0.95%, and 0.79% with dropout. Because our system is based on a convolutional topology, it is less general than RBM techniques, but it is more general than existing convolutional systems because it does not require the same domain assumptions or a pre-defined topology. A randomly initialized network gives a much poorer result (1.25%), indicating that the coarse learning process plays a significant role. Classification of NORB images is also examined, with results comparable to SVM approaches. Developing higher-level relationships between features in this way offers a distinct method of learning with a deep network of features that can be used in combination with existing techniques.
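As a rough illustration of the two-stage idea described above, the sketch below builds convolutional filters from co-occurrence (covariance) statistics of sampled image patches (the coarse stage) and then fine-tunes a classifier by gradient descent (the fine stage). All function names, the eigenvector-based stand-in for co-occurrence feature creation, and the softmax-only fine-tuning are assumptions chosen for brevity; they are not the authors' LCS-related, population-based implementation.

"""Minimal coarse-then-fine sketch (illustrative; not the paper's method)."""
import numpy as np

rng = np.random.default_rng(0)

def sample_patches(images, k=5, n=5000):
    """Randomly sample k x k patches from images of shape (N, H, W)."""
    N, H, W = images.shape
    i = rng.integers(0, N, n)
    y = rng.integers(0, H - k + 1, n)
    x = rng.integers(0, W - k + 1, n)
    return np.stack([images[a, b:b + k, c:c + k].ravel()
                     for a, b, c in zip(i, y, x)])

def coarse_features(patches, n_feat=16):
    """Coarse stage: derive filters from pixel co-occurrence (covariance)
    statistics of the sampled patches -- a cheap proxy for constructing a
    population of features from observed co-occurrence relationships."""
    centred = patches - patches.mean(axis=0)
    cov = centred.T @ centred / len(centred)
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, -n_feat:].T        # (n_feat, k*k), strongest components last

def feature_responses(images, filters, k=5):
    """Convolve each filter over each image and average-pool the rectified
    responses, giving one feature vector per image."""
    N, H, W = images.shape
    out = np.zeros((N, len(filters)))
    for dy in range(H - k + 1):
        for dx in range(W - k + 1):
            patch = images[:, dy:dy + k, dx:dx + k].reshape(N, -1)
            out += np.maximum(patch @ filters.T, 0)
    return out / ((H - k + 1) * (W - k + 1))

def fine_tune(feats, labels, n_classes=10, lr=0.1, epochs=200):
    """Fine stage: gradient descent on a softmax classifier over the coarse
    features (the paper fine-tunes the whole network; this sketch only
    trains the output layer)."""
    W = np.zeros((feats.shape[1], n_classes))
    Y = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = feats @ W
        logits -= logits.max(axis=1, keepdims=True)
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        W -= lr * feats.T @ (p - Y) / len(feats)
    return W

# Usage with toy data standing in for MNIST-sized images:
images = rng.random((200, 28, 28))
labels = rng.integers(0, 10, 200)
filters = coarse_features(sample_patches(images))
W = fine_tune(feature_responses(images, filters), labels)

The point of the sketch is the division of labour: the coarse stage fixes the network's feature structure quickly from observed statistics, and only the fine stage relies on gradient descent, mirroring the contrast the abstract draws with layer-wise pretraining followed by fine-tuning.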
