Analysis of instance selection algorithms on large datasets with Deep Convolutional Neural Networks

Deep Convolutional Neural Networks (ConvNets) have achieved impressive results in several applications of computer vision and speech processing. Large training sets, however, commonly contain useless samples (instances) that are either redundant or noisy. In machine learning, the process of removing these instances is called instance selection. This paper evaluates how effectively a variety of instance selection methods can reduce the number of training instances for deep ConvNets in order to shorten training time, and studies how these methods affect classification accuracy. Many instance selection methods themselves need a long running time to obtain a representative subset, especially when the dataset is large and high-dimensional. One popular instance selection algorithm is Random Mutation Hill Climbing (RMHC). We propose a new approach that makes RMHC run much faster while achieving the same accuracy as the original RMHC.
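For readers unfamiliar with RMHC, the following is a minimal sketch of the baseline idea only, not the faster variant proposed in this paper. It assumes a 1-nearest-neighbor classifier with squared Euclidean distance as the fitness function, and it accepts a mutation only when accuracy strictly improves. The names `nn_accuracy` and `rmhc_select`, the toy data, and the parameters `k`, `iterations`, and `seed` are illustrative assumptions, not taken from the paper.

```python
import random


def nn_accuracy(prototypes, data):
    """Fraction of (features, label) pairs in data whose nearest prototype shares the label."""
    correct = 0
    for x, y in data:
        # Nearest prototype under squared Euclidean distance (assumed metric).
        nearest = min(
            prototypes,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)),
        )
        correct += nearest[1] == y
    return correct / len(data)


def rmhc_select(train, k, iterations, seed=0):
    """Hill-climb over subsets of k training indices (hypothetical helper)."""
    if not 0 < k < len(train):
        raise ValueError("k must be between 1 and len(train) - 1")
    rng = random.Random(seed)
    # Start from a random subset of k instances.
    selected = rng.sample(range(len(train)), k)
    best = nn_accuracy([train[i] for i in selected], train)
    for _ in range(iterations):
        # Mutate: swap one selected instance for one that is not selected.
        candidate = list(selected)
        pool = [i for i in range(len(train)) if i not in candidate]
        candidate[rng.randrange(k)] = rng.choice(pool)
        score = nn_accuracy([train[i] for i in candidate], train)
        # Keep the mutation only if accuracy improves (a choice of this sketch).
        if score > best:
            selected, best = candidate, score
    return selected, best


# Toy two-class data: (features, label) pairs, made up for illustration.
train = [
    ((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
    ((1.0, 1.0), 1), ((0.9, 1.1), 1), ((1.1, 0.9), 1),
]
selected, best = rmhc_select(train, k=2, iterations=50, seed=1)
print(sorted(selected), best)
```

Each iteration re-evaluates the whole training set against the candidate subset, which illustrates why the running time of this kind of search grows quickly with dataset size and dimensionality.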
