Abstract. Deep learning has recently attracted intense attention owing to its victories in major competitions, which has pushed 'shallow' machine learning methods, the relatively simple and handy algorithms commonly used by industrial engineers, into the background despite their advantages, such as the small amounts of training time and data they require. Taking a practical point of view, we use shallow learning algorithms to construct a learning pipeline that operators can apply without specialized knowledge, an expensive computing environment, or a large amount of labelled data. The proposed pipeline automates the whole classification process: feature selection, feature weighting, and the selection of the most suitable classifier with optimized hyperparameters. The configuration employs particle swarm optimization, a well-known metaheuristic algorithm chosen for its generally fast and accurate optimization, which enables us not only to optimize (hyper)parameters but also to determine the features and classifier appropriate to the problem; these choices have conventionally been made a priori from domain knowledge and either left untouched or handled with naive algorithms such as grid search. In experiments on the MNIST and CIFAR-10 datasets, common computer-vision benchmarks for character recognition and object recognition respectively, our automated learning approach achieves high performance given its simple setting (i.e. one that does not depend on the dataset), small amount of training data, and practical training time. Moreover, compared to deep learning, the performance remains robust with almost no modification even on a remote-sensing object recognition problem, which in turn suggests that our approach can contribute to general classification problems.
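To make the idea concrete, the following is a minimal sketch (not the authors' code) of the core technique named in the abstract: particle swarm optimization over classifier hyperparameters, here an SVM's C and gamma on a small MNIST-like dataset via scikit-learn. The search ranges and PSO constants (w, c1, c2) are illustrative assumptions, not values from the paper, and the full pipeline would also encode feature and classifier choices into each particle's position.

```python
# Hedged sketch: PSO tuning of SVM hyperparameters by cross-validated accuracy.
# All constants and ranges below are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = load_digits(return_X_y=True)  # small stand-in for MNIST

# Search space: log10(C) in [-2, 3], log10(gamma) in [-5, 0] (assumed ranges).
lo, hi = np.array([-2.0, -5.0]), np.array([3.0, 0.0])

def fitness(p):
    """Negative 3-fold CV accuracy of an SVM at position p = (log10 C, log10 gamma)."""
    clf = SVC(C=10.0 ** p[0], gamma=10.0 ** p[1])
    return -cross_val_score(clf, X, y, cv=3).mean()

n_particles, n_iters = 10, 15
pos = rng.uniform(lo, hi, size=(n_particles, 2))   # particle positions
vel = np.zeros_like(pos)                           # particle velocities
pbest = pos.copy()                                 # per-particle best positions
pbest_f = np.array([fitness(p) for p in pos])      # per-particle best fitness
gbest = pbest[pbest_f.argmin()].copy()             # global best position

w, c1, c2 = 0.7, 1.5, 1.5  # inertia and personal/global attraction weights
for _ in range(n_iters):
    r1, r2 = rng.random((2, n_particles, 2))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)               # keep particles in the box
    f = np.array([fitness(p) for p in pos])
    better = f < pbest_f
    pbest[better], pbest_f[better] = pos[better], f[better]
    gbest = pbest[pbest_f.argmin()].copy()

print("best (C, gamma):", 10.0 ** gbest, "CV accuracy:", -pbest_f.min())
```

Because the fitness function is just an arbitrary black box, extending this sketch to the paper's broader search (categorical choices of features and classifier alongside continuous hyperparameters) amounts to adding dimensions to each particle's position and decoding them inside `fitness`.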