From Pixels to Words: A Scalable Journey of Text Information from Product Images to Retail Catalog
Kunal Banerjee | Uddipto Dutta | Vijay Srinivas Agneeswaran | Pranay Dugar | Rajesh Shreedhar Bhat | Anirban Chatterjee | Asit Sharad Tarsode
[1] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[2] Olatunji Ruwase,et al. DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters , 2020, KDD.
[3] Jürgen Schmidhuber,et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks , 2006, ICML.
[4] Andrew Zisserman,et al. Spatial Transformer Networks , 2015, NIPS.
[5] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] C. Thum,et al. Measurement of the Entropy of an Image with Application to Image Focusing , 1984.
[7] Jiří Matas,et al. Real-time scene text localization and recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[8] Xiang Bai,et al. An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[9] Samyam Rajbhandari,et al. DeepSpeed , 2020, Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.
[10] Seong Joon Oh,et al. What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[11] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[12] Ankush Gupta,et al. Synthetic Data for Text Localisation in Natural Images , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Furu Wei,et al. LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding , 2020, ACL.
[14] Hao Wu,et al. Mixed Precision Training , 2017, ICLR.
[15] Cláudio Rosito Jung,et al. License Plate Detection and Recognition in Unconstrained Scenarios , 2018, ECCV.
[16] Xiang Bai,et al. TextBoxes++: A Single-Shot Oriented Scene Text Detector , 2018, IEEE Transactions on Image Processing.
[17] Simon Osindero,et al. Recursive Recurrent Nets with Attention Modeling for OCR in the Wild , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[18] Andrew Zisserman,et al. Reading Text in the Wild with Convolutional Neural Networks , 2014, International Journal of Computer Vision.
[19] Dongyoon Han,et al. Character Region Awareness for Text Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Derek Hoiem,et al. Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.
[21] Xiang Bai,et al. Symmetry-based text line detection in natural scenes , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Wenyu Liu,et al. TextBoxes: A Fast Text Detector with a Single Deep Neural Network , 2016, AAAI.
[23] Palaiahnakote Shivakumara,et al. Recognizing Text with Perspective Distortion in Natural Scenes , 2013, 2013 IEEE International Conference on Computer Vision.
[24] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[25] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[26] Palaiahnakote Shivakumara,et al. A robust arbitrary text detection system for natural scene images , 2014, Expert Syst. Appl.
[27] C. V. Jawahar,et al. Scene Text Recognition using Higher Order Language Priors , 2009, BMVC.
[28] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[29] Kai Wang,et al. End-to-end scene text recognition , 2011, 2011 International Conference on Computer Vision.