A Comprehensive Study on Torchvision Pre-trained Models for Fine-grained Inter-species Classification

This study explores the pre-trained models offered in the Torchvision package, available in the PyTorch library, and investigates their effectiveness on fine-grained image classification. Transfer learning is an effective method for achieving very good performance with limited training data. In many real-world situations, practitioners cannot collect enough data to train a deep neural network efficiently. Transfer-learning models are pre-trained on a large dataset and can deliver good performance on smaller datasets with significantly lower training time. The Torchvision package offers many models for applying transfer learning to smaller datasets, so researchers may need a guideline for selecting a good model. We investigate Torchvision pre-trained models on four datasets: 10 Monkey Species, 225 Bird Species, Fruits 360, and Oxford 102 Flowers. These datasets differ in image resolution, number of classes, and achievable accuracy. We also apply both the usual fully-connected layer and the Spinal fully-connected layer to investigate the effectiveness of SpinalNet; the Spinal fully-connected layer brings better performance in most situations. For a fair comparison, we apply the same augmentation to every model on a given dataset. This paper may help future computer-vision researchers choose a proper transfer-learning model.