Similarity Learning Networks for Animal Individual Re-Identification - Beyond the Capabilities of a Human Observer

Deep learning has become the standard approach to computer vision tasks when large amounts of labeled data are available. Traditional deep learning approaches perform poorly, however, on one-shot learning tasks, where a model must correctly classify a new category after seeing only one example. One such domain is animal re-identification, an application of computer vision that can be used globally to automate species population estimates from camera trap images. Our work demonstrates both the application of similarity comparison networks to animal re-identification and the ability of deep convolutional neural networks to generalize across domains. Few studies have considered animal re-identification methods across species. Here, we compare two similarity comparison methodologies, Siamese and Triplet-Loss networks, built on the AlexNet, VGG-19, DenseNet201, MobileNetV2, and InceptionV3 architectures, and evaluate them using mean average precision (mAP)@1 and mAP@5. We consider five data sets corresponding to five different species: humans, chimpanzees, humpback whales, fruit flies, and Siberian tigers, each with its own unique set of challenges. Triplet Loss outperformed its Siamese counterpart for all species. Without any species-specific modifications, similarity comparison networks can reach a performance level beyond that of humans for the task of animal re-identification. The ability of researchers to re-identify an individual animal upon re-encounter is fundamental for addressing a broad range of questions in the study of population dynamics and community/behavioural ecology. We expect that similarity comparison networks mark the beginning of a major trend that could revolutionize animal re-identification from camera trap data.
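
To make the distinction between the two similarity comparison objectives concrete, the following is a minimal sketch of a pairwise (Siamese-style) contrastive loss and a triplet loss, written on plain embedding vectors. It is an illustration under stated assumptions rather than the implementation used in this work: the function names, the margin value of 1.0, the use of non-squared Euclidean distance in the triplet loss, and the `top_k_hit` helper (a simplified stand-in for rank-based evaluation, not the mAP@k computation reported here) are all hypothetical choices.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(e1, e2, same_individual, margin=1.0):
    """Pairwise loss often used with Siamese networks (assumed form).

    Pairs of the same individual are pulled together; pairs of different
    individuals are pushed apart until they are at least `margin` apart.
    """
    d = euclidean(e1, e2)
    if same_individual:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss (assumed form with non-squared distances).

    The anchor should be closer to the positive (same individual) than to
    the negative (different individual) by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def top_k_hit(query, gallery, query_id, k):
    """True if the query's identity is among its k nearest gallery embeddings.

    `gallery` is a list of (identity, embedding) pairs. This is a simplified,
    hypothetical rank check, not the paper's mAP@k metric.
    """
    ranked = sorted(gallery, key=lambda item: euclidean(query, item[1]))
    return any(identity == query_id for identity, _ in ranked[:k])


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    same = [0.0, 1.0]   # hypothetical embedding of the same individual
    other = [3.0, 4.0]  # hypothetical embedding of a different individual
    print(triplet_loss(anchor, same, other))   # well-separated triplet
    print(triplet_loss(anchor, other, same))   # violated triplet
    print(contrastive_loss(anchor, same, same_individual=True))
```

The pairwise loss sees only two images at a time, whereas the triplet loss constrains the relative ordering of a matching and a non-matching image around the same anchor, which is one plausible reason the two objectives can behave differently in re-identification settings.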
