Smart Library: Identifying Books on Library Shelves Using Supervised Deep Learning for Scene Text Reading

Physical library collections are valuable and long-standing resources for knowledge and learning. However, managing and finding books or other volumes on large sets of bookshelves often requires tedious manual work, especially when items are missing or misplaced. Recently, deep neural network-based models have been successful in detecting and recognizing text in images taken from natural scenes. Building on this, we investigate deep learning for facilitating book management. This task introduces further challenges, including image distortion and varied lighting conditions. We present a library inventory building and retrieval system based on scene text reading. We specifically design our text recognition model using rich supervision to accelerate training, and it achieves state-of-the-art performance on several benchmark datasets. Our proposed system has the potential to greatly reduce the amount of manual labor required for managing book inventories.
