Large-Scale Object Detection of Images from Network Cameras in Variable Ambient Lighting Conditions

Computer vision relies on labeled datasets for training and evaluation in detecting and recognizing objects. YOLO ("You Only Look Once"), a popular object detection system, has been shown to accurately detect objects in many major image datasets. However, the images in those datasets are independent of one another and cannot be used to test YOLO's consistency at detecting the same object as its environment (e.g., ambient lighting) changes. This paper describes a novel effort to evaluate YOLO's consistency for large-scale applications. It does so by (a) working at large scale and (b) using consecutive images from a curated network of public video cameras deployed in a variety of real-world situations, including traffic intersections, national parks, shopping malls, and university campuses. We specifically examine YOLO's ability to detect objects in different scenarios (e.g., daytime vs. night), leveraging the cameras' ability to rapidly retrieve many successive images for evaluating detection consistency. Using our camera network and advanced computing resources (supercomputers), we analyzed more than 5 million images captured by 140 network cameras in 24 hours. Compared with labels marked by humans (considered as "ground truth"), YOLO struggles to consistently detect the same humans and cars as their positions change from one frame to the next; it also struggles to detect objects at night. Our findings suggest that state-of-the-art vision solutions should be trained on data from network cameras with contextual information before they are deployed in applications that demand high consistency in object detection.
