Fast Image-based Chinese Calligraphic Character Retrieval on Large Scale Data

Chinese calligraphy is the art of handwriting, and it draws wide attention for its beauty and elegance. In CADAL, a Calligraphic Character Dictionary (CCD) containing hundreds of thousands of character images labeled with their semantic meaning has been constructed and made available online to general users. Performing fast and accurate image-based calligraphic character retrieval on CCD is a great challenge. In this paper, a novel shape descriptor, Oriented Shape Context (OSC), is proposed based on the traditional Shape Context (SC) to perform similarity search. Combining OSC with GIST, the GIST-OSC descriptor is proposed to represent calligraphic character images for efficient and effective retrieval. In addition, an effective retrieval schema that works in two steps is proposed. First, approximate nearest neighbors of the query image are found quickly using GIST; then, one-to-one fine matching between those approximate nearest neighbors and the query image is performed using OSC. Our experiments show that the GIST-OSC descriptor and the retrieval schema are efficient and effective for Chinese calligraphic character retrieval on large-scale data.
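
The two-step schema can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define how GIST or OSC are computed, so the descriptors here are assumed to be precomputed vectors, the coarse step uses brute-force Euclidean search as a stand-in for approximate nearest neighbor search, and the fine step uses a chi-square histogram distance as a stand-in for OSC matching. The names `two_stage_retrieve`, `chi2_distance`, `n_candidates`, and `top_k` are hypothetical.

```python
import numpy as np


def chi2_distance(a, b, eps=1e-12):
    """Chi-square distance between two non-negative histograms.

    Assumption: used here only as a placeholder for OSC fine matching.
    """
    return 0.5 * float(np.sum((a - b) ** 2 / (a + b + eps)))


def two_stage_retrieve(query_coarse, query_fine, db_coarse, db_fine,
                       fine_distance=chi2_distance,
                       n_candidates=50, top_k=5):
    """Coarse filtering on global descriptors, then fine re-ranking.

    db_coarse: (N, Dc) array of global descriptors (GIST in the paper).
    db_fine:   (N, Df) array of shape descriptors (OSC in the paper).
    Returns a list of (database_index, fine_distance) pairs, best first.
    """
    # Step 1: shortlist candidates by Euclidean distance on the coarse
    # descriptor. Brute force here; an approximate index would replace it.
    coarse = np.linalg.norm(db_coarse - query_coarse, axis=1)
    n_candidates = min(n_candidates, len(coarse))
    candidates = np.argsort(coarse)[:n_candidates]

    # Step 2: one-to-one fine matching against the shortlist only.
    fine = np.array([fine_distance(query_fine, db_fine[i])
                     for i in candidates])
    order = np.argsort(fine, kind="stable")[:top_k]
    return [(int(candidates[j]), float(fine[j])) for j in order]


# Usage with synthetic descriptors (random data, not real character images).
rng = np.random.default_rng(0)
db_coarse = rng.random((200, 16))
db_fine = rng.random((200, 32))
# Query with an exact copy of database entry 3.
results = two_stage_retrieve(db_coarse[3].copy(), db_fine[3].copy(),
                             db_coarse, db_fine)
print(results[0])  # best match is entry 3, at fine distance 0.0
```

The point of the split is cost: the expensive fine distance is evaluated only on the `n_candidates` shortlist rather than on the whole collection, so the shortlist size trades recall against query time.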
