How Trustworthy are the Existing Performance Evaluations for Basic Vision Tasks?

Performance evaluation is indispensable to the advancement of machine vision, yet its consistency and rigour have not received proportionate attention. This paper examines performance evaluation criteria for basic vision tasks, namely object detection, instance-level segmentation, and multi-object tracking. Specifically, we advocate the use of criteria that are (i) consistent with mathematical requirements such as the metric properties, (ii) contextually meaningful in sanity tests, and (iii) robust to hyper-parameters for reliability. We show that many widely used performance criteria do not fulfill these requirements. Moreover, we explore alternative criteria for detection, segmentation, and tracking, using metrics for sets of shapes, and assess them against these requirements.
