Siamese Networks for Few-Shot Learning on Edge Embedded Devices

Edge artificial intelligence hardware mainly targets inference with networks that have been pretrained on massive datasets. Few-shot learning seeks methods that allow a network to reach high accuracy even when only a few samples of each class are available. Siamese networks can be used to tackle few-shot learning problems and are unique in that they do not require retraining on samples of the new classes. They are therefore suitable for edge hardware accelerators, which often do not include on-chip training capabilities. This work describes improvements to a baseline Siamese network and benchmarks the improved network on edge platforms. The modifications to the baseline network include multi-resolution kernels, a hybrid training process, and a different method for computing embedding similarity. The improved network shows an average accuracy improvement of up to 22% across 4 datasets in a 5-way, 1-shot classification task. Benchmarking results on three edge computing platforms (NVIDIA Jetson Nano, Coral Edge TPU and a custom convolutional neural network accelerator) show that a Siamese classifier can run on these devices at frame rates suitable for real-time performance, between 3 frames per second (FPS) on the Jetson Nano and 60 FPS on the Edge TPU. When weight sparsity is increased during training, the inference speed of a network with 25% weight sparsity increases by 10 FPS with only a 1% drop in accuracy.
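To illustrate why a Siamese classifier needs no retraining for new classes, the following minimal sketch performs 5-way, 1-shot classification by comparing the embedding of a query against one stored support sample per class. It is not the network described in this work: `toy_embed` is a hypothetical stand-in for the trained Siamese branch, the class labels and vectors are made up, and cosine similarity is assumed only for illustration, since the abstract does not specify the similarity computation used.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_one_shot(
    query: Vector,
    support: Dict[str, Vector],
    embed: Callable[[Vector], List[float]],
    similarity: Callable[[Vector, Vector], float] = cosine_similarity,
) -> str:
    """Return the label whose single support sample is most similar to the query.

    Adding a new class only requires adding one entry to `support`;
    the weights behind `embed` are never updated.
    """
    query_emb = embed(query)
    scores = {
        label: similarity(query_emb, embed(sample))
        for label, sample in support.items()
    }
    return max(scores, key=scores.get)


def toy_embed(x: Vector) -> List[float]:
    # Hypothetical placeholder for the shared Siamese branch (assumption):
    # a real system would run a trained CNN here. This one is the identity.
    return list(x)


# One support sample per class: a 5-way, 1-shot episode with made-up data.
support_set = {
    "class_0": [1.0, 0.0, 0.0, 0.0, 0.0],
    "class_1": [0.0, 1.0, 0.0, 0.0, 0.0],
    "class_2": [0.0, 0.0, 1.0, 0.0, 0.0],
    "class_3": [0.0, 0.0, 0.0, 1.0, 0.0],
    "class_4": [0.0, 0.0, 0.0, 0.0, 1.0],
}

query_sample = [0.1, 0.9, 0.0, 0.1, 0.0]
predicted = classify_one_shot(query_sample, support_set, toy_embed)
print(predicted)
```

Because classification reduces to forward passes and a similarity comparison, the same procedure can run on inference-only accelerators; on real hardware the support embeddings would typically be computed once and cached rather than recomputed for every query.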
