Needles in Haystacks: On Classifying Tiny Objects in Large Images

In some important computer vision domains, such as medical or hyperspectral imaging, we care about the classification of tiny objects in large images. However, most Convolutional Neural Networks (CNNs) for image classification were developed on biased datasets that contain large objects, mostly in central image positions. To assess whether classical CNN architectures work well for tiny object classification, we build a comprehensive testbed containing two datasets: one derived from MNIST digits and one from histopathology images. This testbed allows controlled experiments that stress-test CNN architectures across a broad spectrum of signal-to-noise ratios. Our observations indicate that: (1) there is a signal-to-noise limit below which CNNs fail to generalize, and this limit depends on dataset size, with more data leading to better performance; however, the amount of training data required for the model to generalize scales rapidly with the inverse of the object-to-image ratio; (2) in general, higher-capacity models generalize better; (3) when the approximate object size is known, adapting the receptive field is beneficial; and (4) for very small signal-to-noise ratios, the choice of global pooling operation affects optimization, whereas for relatively large signal-to-noise ratios, all tested global pooling operations perform similarly.
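As a rough illustration of the quantities discussed above, the following sketch computes an object-to-image ratio and compares two global pooling operations on a toy activation map. It is not the authors' testbed or code: the helper names (`object_to_image_ratio`, `make_feature_map`, `global_max_pool`, `global_mean_pool`) are hypothetical, the ratio is assumed here to be object area divided by image area, and the "feature map" is simply a grid of zeros with ones on the object patch.

```python
import random


def object_to_image_ratio(obj_h, obj_w, img_h, img_w):
    """Assumed definition: fraction of image pixels covered by the object."""
    return (obj_h * obj_w) / (img_h * img_w)


def make_feature_map(img_h, img_w, obj_h, obj_w, seed=0):
    """Toy activation map: zeros everywhere, ones on a randomly placed patch."""
    rng = random.Random(seed)
    top = rng.randrange(img_h - obj_h + 1)
    left = rng.randrange(img_w - obj_w + 1)
    fmap = [[0.0] * img_w for _ in range(img_h)]
    for r in range(top, top + obj_h):
        for c in range(left, left + obj_w):
            fmap[r][c] = 1.0
    return fmap


def global_max_pool(fmap):
    """Largest activation over the whole map."""
    return max(max(row) for row in fmap)


def global_mean_pool(fmap):
    """Average activation over the whole map."""
    total = sum(sum(row) for row in fmap)
    return total / (len(fmap) * len(fmap[0]))


if __name__ == "__main__":
    # Keep the object fixed at 8x8 and grow the image around it.
    for size in (64, 128, 256):
        fmap = make_feature_map(size, size, 8, 8)
        ratio = object_to_image_ratio(8, 8, size, size)
        print(size, ratio, global_max_pool(fmap), global_mean_pool(fmap))
```

In this toy setting the max-pooled value stays at 1.0 regardless of image size, while the mean-pooled value equals the object-to-image ratio and shrinks as the image grows. This only hints at why the pooling choice could matter at low ratios; the abstract's finding concerns optimization behavior in trained CNNs, which this sketch does not model.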
