Recent years have witnessed a substantial increase in the deep learning (DL) architectures proposed for visual recognition tasks like person re-identification, where individuals must be recognized across multiple distributed cameras. Although these architectures have greatly improved the state-of-the-art accuracy, the computational complexity of the convolutional neural networks (CNNs) commonly used for feature extraction remains an issue, hindering their deployment on platforms with limited resources, or in applications with real-time constraints. There is an obvious advantage to accelerating and compressing DL models without significantly decreasing their accuracy. However, the source (pruning) domain differs from operational (target) domains, and the domain shift between image data captured from different non-overlapping camera viewpoints leads to lower recognition accuracy. In this paper, we investigate the prunability of these architectures under different design scenarios. We first revisit pruning techniques that are suitable for reducing the computational complexity of deep CNNs applied to person re-identification. Then, these techniques are analyzed according to their pruning criteria and strategy, and according to different scenarios for applying pruning methods when fine-tuning networks to target domains. Experimental results obtained using DL models with ResNet feature extractors, and multiple benchmark re-identification datasets, indicate that pruning can considerably reduce network complexity while maintaining a high level of accuracy. In scenarios where pruning is performed with large pretraining or fine-tuning datasets, the number of FLOPs required by ResNet architectures is reduced by half, while maintaining a comparable rank-1 accuracy (within 1% of the original model). Pruning while training larger CNNs can also provide significantly better performance than fine-tuning smaller ones.
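To make the notions of a pruning criterion and a FLOPs reduction concrete, the following is a minimal sketch rather than the method evaluated in the paper. It assumes a simple L1-norm filter-ranking criterion and a toy pair of consecutive 3x3 convolutions; the function names (`conv_macs`, `keep_indices`), the layer sizes, and the toy weights are all hypothetical choices made for illustration. It counts multiply-accumulate operations as a proxy for FLOPs.

```python
def conv_macs(c_in, c_out, kernel, h_out, w_out):
    """Multiply-accumulate count of a kernel x kernel convolution (bias ignored)."""
    return c_in * c_out * kernel * kernel * h_out * w_out


def keep_indices(filters, keep_ratio):
    """Rank filters by L1 norm and return the indices of those to keep.

    filters: list of filters, each given as a flat list of weights.
    keep_ratio: fraction of filters to retain (assumed to be in (0, 1]).
    """
    norms = [sum(abs(w) for w in f) for f in filters]
    n_keep = max(1, int(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy layer with four filters: the two with the smallest L1 norm are dropped.
toy_filters = [[0.9, -0.8], [0.01, 0.02], [-0.5, 0.7], [0.0, -0.03]]
kept = keep_indices(toy_filters, keep_ratio=0.5)

# Two consecutive 64-channel 3x3 convolutions on a 56x56 feature map (assumed sizes).
# Removing half of the first layer's filters also halves the second layer's inputs.
dense = conv_macs(64, 64, 3, 56, 56) + conv_macs(64, 64, 3, 56, 56)
pruned = conv_macs(64, 32, 3, 56, 56) + conv_macs(32, 64, 3, 56, 56)

print("kept filters:", kept)
print("cost ratio after pruning:", pruned / dense)
```

In this toy setting, removing half of the filters in one layer halves the cost of that layer and of the layer that consumes its output, which illustrates how channel-level pruning can translate into a reduction of roughly half the total operations. Whether accuracy is preserved depends on the criterion, the schedule, and the data used for fine-tuning, which this sketch does not model.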