Few-Shot Pill Recognition

Pill image recognition is vital for many personal/public health-care applications and should be robust to diverse unconstrained real-world conditions. Most existing pill recognition models are limited in tackling this challenging few-shot learning problem due to the insufficient instances per category. With limited training data, neural network-based models have limitations in discovering most discriminating features, or going deeper. Especially, existing models fail to handle the hard samples taken under less controlled imaging conditions. In this study, a new pill image database, namely CURE, is first developed with more varied imaging conditions and instances for each pill category. Secondly, a W2-net is proposed for better pill segmentation. Thirdly, a Multi-Stream (MS) deep network that captures task-related features along with a novel two-stage training methodology are proposed. Within the proposed framework, a Batch All strategy that considers all the samples is first employed for the sub-streams, and then a Batch Hard strategy that considers only the hard samples mined in the first stage is utilized for the fusion network. By doing so, complex samples that could not be represented by one type of feature could be focused and the model could be forced to exploit other domain-related information more effectively. Experiment results show that the proposed model outperforms state-of-the-art models on both the National Institute of Health (NIH) and our CURE database.

[1]  Jesus J. Caban,et al.  Automatic identification of prescription drugs using shape distribution models , 2012, 2012 19th IEEE International Conference on Image Processing.

[2]  Xiang Bai,et al.  Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Anuja Roy,et al.  Medication compliance and persistence: terminology and definitions. , 2008, Value in health : the journal of the International Society for Pharmacoeconomics and Outcomes Research.

[4]  Anil K. Jain,et al.  PILL-ID: Matching and Retrieval of Drug Pill Imprint Images , 2010, 2010 20th International Conference on Pattern Recognition.

[5]  Daan Wierstra,et al.  Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.

[6]  Bernt Schiele,et al.  Meta-Transfer Learning for Few-Shot Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Jürgen Schmidhuber,et al.  Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.

[8]  Xiao Zeng,et al.  MobileDeepPill: A Small-Footprint Mobile Deep Learning System for Recognizing Unconstrained Pill Images , 2017, MobiSys.

[9]  Wolfram Burgard,et al.  Multimodal deep learning for robust RGB-D object recognition , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[10]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[11]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[12]  Hong Yu,et al.  Meta Networks , 2017, ICML.

[13]  Kai Fan,et al.  Zero-Shot Learning via Class-Conditioned Deep Generative Models , 2017, AAAI.

[14]  Jia Deng,et al.  Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.

[15]  Tsendsuren Munkhdalai,et al.  Rapid Adaptation with Conditionally Shifted Neurons , 2017, ICML.

[16]  Paolo Frasconi,et al.  Bilevel Programming for Hyperparameter Optimization and Meta-Learning , 2018, ICML.

[17]  Henry A Spiller,et al.  Increasing burden of pill identification requests to US Poison Centers , 2009, Clinical toxicology.

[18]  Susan Hodgson,et al.  Disaster-Driven Evacuation and Medication Loss: a Systematic Literature Review , 2014, PLoS currents.

[19]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[21]  Thomas L. Griffiths,et al.  Recasting Gradient-Based Meta-Learning as Hierarchical Bayes , 2018, ICLR.

[22]  Jiri Matas,et al.  Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[23]  Tao Xiang,et al.  Learning to Compare: Relation Network for Few-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Patrick Le Callet,et al.  The Role of Structure and Textural Information in Image Utility and Quality Assessment Tasks , 2018, HVEI.

[25]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[26]  Anil K. Jain,et al.  Pill-ID: Matching and retrieval of drug pill images , 2012, Pattern Recognit. Lett..

[27]  Alexandre Lacoste,et al.  TADAM: Task dependent adaptive metric for improved few-shot learning , 2018, NeurIPS.

[28]  Lucas Beyer,et al.  In Defense of the Triplet Loss for Person Re-Identification , 2017, ArXiv.

[29]  Olivier Bodenreider,et al.  The national library of medicine pill image recognition challenge: An initial report , 2016, 2016 IEEE Applied Imagery Pattern Recognition Workshop (AIPR).

[30]  Linda G. Shapiro,et al.  ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[32]  Telmo Adão,et al.  HelpmePills: A Mobile Pill Recognition Tool for Elderly Persons☆ , 2014 .

[33]  Jie Yang,et al.  Accurate system for automatic pill recognition using imprint information , 2015, IET Image Process..

[34]  Xiaogang Wang,et al.  Finding Task-Relevant Features for Few-Shot Learning by Category Traversal , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).