Towards Automatic Diagram Description for the Blind

Conventional methods for describing complex diagrams to blind people are either ineffective or inefficient. We describe preliminary work on describing a diagram to a blind person with minimal supervision. We present a text localization method that outperforms commercially available, off-the-shelf, state-of-the-art systems, and a prototype user interface that, according to our user study, can effectively describe diagrams.
