Animal species recognition in the wildlife based on muzzle and shape features using joint CNN

Abstract Monitoring animal behavior in the wild requires reliable techniques for species recognition, mainly from visual data captured by camera traps. In this paper, we propose to extend the VGG Convolutional Neural Network (CNN) with three branches: two VGG16 branches for recognizing the muzzle and a part of the shape, and one VGG19 branch for recognizing the whole shape. Such a branched CNN structure is needed because of the great variety of animal poses captured by a camera trap. We also faced the objective problem of an unbalanced dataset, which arises from differences in animal behavior in nature. A preliminary image categorization procedure helps to obtain better recognition results. Experiments were conducted using a dataset obtained from Ergaki national park, Krasnoyarsky Kray, Russia, in 2012-2018. The joint CNN shows good accuracy on the balanced dataset, achieving 80.6% Top-1 and 94.1% Top-5 accuracy. With the unbalanced training dataset, we obtained 38.7% Top-1 and 54.8% Top-5 accuracy.
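The Top-1 and Top-5 figures quoted above count a prediction as correct when the true species is, respectively, the highest-scoring class or among the five highest-scoring classes. The following is a minimal sketch of that metric in plain Python, not the evaluation code used in the paper; the function name `top_k_accuracy`, the toy scores, and the six-class setup are hypothetical and chosen only for illustration.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-class score lists, one list per image (hypothetical data).
    labels: list of true class indices, one per image.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from the highest to the lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example: 4 images, 6 hypothetical species classes.
scores = [
    [0.50, 0.20, 0.10, 0.10, 0.05, 0.05],  # true class 0 is ranked first
    [0.10, 0.40, 0.30, 0.10, 0.05, 0.05],  # true class 2 is ranked second
    [0.30, 0.25, 0.20, 0.15, 0.07, 0.03],  # true class 5 is ranked last
    [0.05, 0.05, 0.10, 0.60, 0.10, 0.10],  # true class 3 is ranked first
]
labels = [0, 2, 5, 3]

top1 = top_k_accuracy(scores, labels, k=1)
top5 = top_k_accuracy(scores, labels, k=5)
print(f"Top-1: {top1:.3f}, Top-5: {top5:.3f}")
```

In this toy case two of the four images are correct at rank one and three of the four fall within the first five ranks, which illustrates why Top-5 accuracy is never lower than Top-1 accuracy on the same predictions.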