Multi-class segmentation of free-form online documents with tree conditional random fields

We present a new system for segmenting online handwritten documents into multiple block types, such as text paragraphs, tables, graphics, and mathematical expressions. The document is represented hierarchically by aggregating strokes into blocks, and interactions between the levels of the hierarchy are modeled with a tree Conditional Random Field. Features are extracted and labels are predicted at each tree level with logistic classifiers, and Belief Propagation is used for optimal inference over the structure. Being fully trainable, the system is shown to handle difficult segmentation problems arising in unconstrained online note-taking documents, where no prior knowledge is available about the layout or the expected content. Our experiments show very promising results and allow us to envision fully automatic segmentation of free-form online notes.
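
To make the inference step concrete, the following is a minimal sketch of sum-product Belief Propagation on a small tree, which is exact on tree-structured models. It is not the system described above: the function name `tree_marginals`, the two-label setup, the unary potentials (standing in for per-node classifier outputs), and the parent-child compatibility matrix are all assumptions chosen for illustration.

```python
def normalize(v):
    """Scale a list of nonnegative numbers so that it sums to 1."""
    s = sum(v)
    return [x / s for x in v]


def tree_marginals(parent, unary, pairwise):
    """Exact per-node label marginals on a tree via sum-product.

    parent[i]      : index of the parent of node i (-1 for the root, node 0);
                     nodes are assumed ordered so that parent[i] < i.
    unary[i][a]    : nonnegative potential for node i taking label a
                     (hypothetical stand-in for a classifier score).
    pairwise[a][b] : potential for a parent with label a and a child with label b.
    """
    n, k = len(parent), len(unary[0])
    children = [[] for _ in range(n)]
    for i in range(1, n):
        children[parent[i]].append(i)

    # Upward pass (leaves to root). Children have larger indices than their
    # parent, so iterating in reverse visits every child before its parent.
    belief_up = [list(u) for u in unary]
    up = [None] * n
    for i in range(n - 1, 0, -1):
        msg = normalize([
            sum(pairwise[a][b] * belief_up[i][b] for b in range(k))
            for a in range(k)
        ])
        up[i] = msg
        p = parent[i]
        belief_up[p] = [belief_up[p][a] * msg[a] for a in range(k)]

    # Downward pass (root to leaves), then combine both directions.
    down = [[1.0] * k for _ in range(n)]
    marginals = [None] * n
    for i in range(n):
        if i > 0:
            p = parent[i]
            # Everything the parent knows, excluding what node i told it.
            excl = [unary[p][a] * down[p][a] for a in range(k)]
            for c in children[p]:
                if c != i:
                    excl = [excl[a] * up[c][a] for a in range(k)]
            down[i] = normalize([
                sum(pairwise[a][b] * excl[a] for a in range(k))
                for b in range(k)
            ])
        marginals[i] = normalize(
            [belief_up[i][b] * down[i][b] for b in range(k)]
        )
    return marginals


# Toy example: one block node (index 0) with two stroke nodes as children.
# Labels 0 and 1 could stand for "text" and "non-text"; all values are made up.
example_parent = [-1, 0, 0]
example_unary = [[0.5, 0.5], [0.9, 0.1], [0.6, 0.4]]
example_pairwise = [[2.0, 1.0], [1.0, 2.0]]  # favors agreeing labels
example_marginals = tree_marginals(example_parent, example_unary, example_pairwise)
print(example_marginals)
```

In this toy setup the block node has an uninformative unary potential, so its marginal is shaped entirely by the messages received from its two stroke children, which illustrates how evidence can flow between levels of a hierarchy.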
