Visual language processing (VLP) of ancient manuscripts: Converting collections to windows on the past

Ancient manuscripts constitute a primary carrier of cultural heritage globally, and they are being intensively digitized all over the world to ensure their preservation and, ultimately, the wide accessibility of their content. Critical to this research process are the legibility of the documents in image form and access to live texts. Several state-of-the-art methods and approaches have been proposed and developed to address the challenges associated with processing these manuscripts. However, the huge amount of data involved, together with the high cost and scarcity of human expert feedback and reference data, calls for fundamental approaches that encompass all these aspects in an objective and tractable manner. In this paper, we propose one such approach: a novel framework for the computational pattern analysis of ancient manuscripts that is data-driven, multilevel, self-sustaining, and learning-based, and that takes advantage of the large quantities of unprocessed data available. Unlike many approaches, which fast-forward to the processing and analysis of feature vectors, our framework represents a new perspective on the task, starting from ground zero of the problem, namely the definition of objects. In addition, it leverages data-driven mining of relations among objects to discover hidden but persistent links between them. The problem is addressed at three main levels. At the lowest level, that of images, the framework tackles automatic, data-driven enhancement and restoration of document images using spatial, spectral, sparse, and graph-based representations of visual objects. At the second level, that of transliteration, directed graphical models, hidden Markov models (HMMs), undirected random fields, and spatial relations models are used to extract the live text of manuscript images, which reduces dependency on human experts. Finally, the highest level, that of network analysis of the relations among objects (from patches and words to manuscripts and writers), involves the search for "social networks" linking manuscripts. Considering this approach under the umbrella of Visual Language Processing (VLP), we hope that it will be further enriched by the research community, in the form of new insights and approaches contributed at the various levels.
