Vision Based Assistive System for Label Detection with Voice Output

A camera-based assistive text reading framework is proposed to help blind persons read text labels and product packaging from hand-held objects in their daily lives. To isolate the object from cluttered backgrounds or other surrounding objects in the camera view, we propose an efficient and effective motion-based method that defines a region of interest (ROI) in the video by asking the user to shake the object. Text localization and recognition are then conducted within the extracted ROI to acquire text information. To automatically localize the text regions within the object ROI, we propose a novel text localization algorithm that learns gradient features of stroke orientations and distributions of edge pixels in an AdaBoost model. Text characters in the localized text regions are then binarized and recognized by off-the-shelf optical character recognition (OCR) software. The recognized text codes are output to blind users as speech.

Keywords: assistive devices, blindness, distribution of edge pixels, hand-held objects, optical character recognition (OCR), stroke orientation, text reading, text region localization.
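As an illustration of the motion-based ROI step, the following minimal sketch uses plain frame differencing: pixels whose intensity changes between consecutive frames are treated as belonging to the shaken object, and their bounding box is taken as the ROI. This is an assumed stand-in, not the method of the paper, whose details are not given in the abstract. The names `motion_roi`, `make_frame`, `diff_threshold`, and `min_hits` are hypothetical, and frames are modeled as nested lists of grayscale intensities so the example has no dependencies.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image as rows of 0-255 intensities


def motion_roi(frames: List[Frame], diff_threshold: int = 30,
               min_hits: int = 1) -> Optional[Tuple[int, int, int, int]]:
    """Return the inclusive bounding box (top, left, bottom, right) of pixels
    that changed between consecutive frames, or None if nothing moved.

    Assumption: simple frame differencing stands in for the motion-based
    method; the thresholds are illustrative values, not tuned ones.
    """
    if len(frames) < 2:
        return None
    height, width = len(frames[0]), len(frames[0][0])
    # hits[r][c] counts how many frame pairs showed a change at pixel (r, c).
    hits = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for r in range(height):
            for c in range(width):
                if abs(curr[r][c] - prev[r][c]) > diff_threshold:
                    hits[r][c] += 1
    moving = [(r, c) for r in range(height) for c in range(width)
              if hits[r][c] >= min_hits]
    if not moving:
        return None
    rows = [r for r, _ in moving]
    cols = [c for _, c in moving]
    return min(rows), min(cols), max(rows), max(cols)


def make_frame(left: int) -> Frame:
    """Build a synthetic 6x8 frame with a bright 2x2 'object' at column `left`."""
    frame = [[0] * 8 for _ in range(6)]
    for r in (2, 3):
        for c in (left, left + 1):
            frame[r][c] = 200
    return frame


# The object shifts one pixel right and back, imitating a shake.
frames = [make_frame(2), make_frame(3), make_frame(2)]
roi = motion_roi(frames)
print(roi)
```

In a full pipeline, the returned box would be used to crop the frame before text localization and OCR. A practical system would likely add noise filtering and a more robust background model than raw differencing.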
