Fine-tuning deep CNN models on specific MS COCO categories

Fine-tuning a deep convolutional neural network (CNN) is often desirable when only a specific set of object categories is of interest. This paper provides an overview of our publicly available py-faster-rcnn-ft software library, which can be used to fine-tune the VGG_CNN_M_1024 model on custom subsets of the Microsoft Common Objects in Context (MS COCO) dataset. Among other improvements, the user no longer has to search the dataset by hand for suitable image files to use in the demo program: our implementation randomly selects images that contain at least one object of the categories on which the model is fine-tuned.
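The random image selection described above can be illustrated with a minimal sketch. This is not the py-faster-rcnn-ft code itself: the function name `select_demo_images`, its parameters, and the toy data are hypothetical, and the only assumption taken from MS COCO is the layout of its instance annotation files (top-level `images`, `annotations`, and `categories` lists).

```python
import random


def select_demo_images(coco, category_names, k, seed=None):
    """Pick up to k random file names of images that contain at least
    one object from the given categories.

    `coco` is a dict in the MS COCO instance-annotation layout.
    Hypothetical helper for illustration only.
    """
    # Map the requested category names to their numeric category ids.
    wanted_ids = {c["id"] for c in coco["categories"]
                  if c["name"] in category_names}
    # Collect ids of images with at least one matching annotation.
    image_ids = {a["image_id"] for a in coco["annotations"]
                 if a["category_id"] in wanted_ids}
    # Sort so that a fixed seed gives a reproducible sample.
    candidates = sorted(img["file_name"] for img in coco["images"]
                        if img["id"] in image_ids)
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))


# Tiny hand-made stand-in for an annotation file (not real COCO data).
toy = {
    "categories": [{"id": 1, "name": "person"},
                   {"id": 2, "name": "car"},
                   {"id": 3, "name": "dog"}],
    "images": [{"id": 10, "file_name": "a.jpg"},
               {"id": 11, "file_name": "b.jpg"},
               {"id": 12, "file_name": "c.jpg"},
               {"id": 13, "file_name": "d.jpg"}],
    "annotations": [{"image_id": 10, "category_id": 1},
                    {"image_id": 11, "category_id": 2},
                    {"image_id": 12, "category_id": 3}],
}

picked = select_demo_images(toy, {"person", "dog"}, k=2, seed=0)
```

In this sketch, images without any annotation of the chosen categories (here `b.jpg` and `d.jpg`) are never candidates, so every selected image is a meaningful input for a demo of the fine-tuned model.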
