Paper information - Text Detection and Translation from Natural Scenes

Abstract: The authors present a system for automatically extracting and interpreting signs in natural scenes. The system captures images, detects and recognizes signs, and translates them into a target language. The translation can be displayed on a hand-held wearable display or a head-mounted display, or synthesized as a voice message played over earphones. The paper addresses the challenges of automatic sign extraction and translation, describes methods for automatic sign extraction, and extends example-based machine translation technology to sign translation. System development follows a user-centered approach that takes advantage of human intelligence and leverages human capabilities. The current work focuses on Chinese sign translation: the authors have developed a prototype that recognizes Chinese signs from a video camera and translates them into either English text or a voice stream, and they have built a database of about 800 Chinese signs for development and evaluation. The authors hope that sign translation, in conjunction with spoken language translation, will help international tourists overcome language barriers. The technology could also help visually handicapped people increase their environmental awareness.

Authors: Alex Waibel, Ying Zhang, Jie Yang, Jiang Gao
