Towards Automated Transcription of Label Text from Pinned Insect Collections

We present a computer vision system that transcribes the text on the tiny printed labels stacked beneath pinned insects in museum collections. Because each label is often occluded by the pin, the specimen, and other labels, the system images every label from multiple viewpoints, which lets it handle both the occlusion and the extreme viewing angles needed to see the stacked labels. Automated image analysis identifies the lines of text and then aligns and rectifies the images. Combining the aligned, rectified images from multiple viewpoints yields a composite image that optical character recognition (OCR) tools can read to extract the text. We demonstrate the system experimentally on both museum specimens and test labels.
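
The multi-view compositing step can be illustrated with a small sketch. The abstract does not say how the aligned views are fused, so the rule used here (a per-pixel median over the views in which a pixel is unoccluded) is an assumption chosen for illustration, not the authors' method. The function name `composite_views` and the `valid_masks` input are hypothetical, and the views are assumed to be already aligned and rectified to a common grid.

```python
import warnings

import numpy as np


def composite_views(views, valid_masks):
    """Fuse aligned grayscale views into one composite image.

    Assumption (not from the paper): each output pixel is the median of
    that pixel over the views where it is marked as unoccluded.
    Pixels that no view sees are returned as NaN.
    """
    stack = np.asarray(views, dtype=float)       # shape: (n_views, H, W)
    masks = np.asarray(valid_masks, dtype=bool)  # True where the label is visible
    if stack.shape != masks.shape:
        raise ValueError("views and valid_masks must have the same shape")

    # Hide occluded pixels so they do not contribute to the median.
    masked = np.where(masks, stack, np.nan)

    # nanmedian warns on pixels that are occluded in every view; the NaN
    # result is the intended "no data" marker, so the warning is silenced.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=RuntimeWarning)
        return np.nanmedian(masked, axis=0)


if __name__ == "__main__":
    # Toy example: a 3x4 "label" where each view has one column hidden by a pin.
    label = np.arange(12, dtype=float).reshape(3, 4)
    views = [label.copy() for _ in range(3)]
    masks = [np.ones((3, 4), dtype=bool) for _ in range(3)]
    for i in range(3):
        views[i][:, i] = 255.0   # occluder overwrites column i in view i
        masks[i][:, i] = False   # and the mask marks that column as invalid
    fused = composite_views(views, masks)
    print(np.array_equal(fused, label))
```

In this toy setup every pixel is visible in at least two views, so the composite recovers the unoccluded label. In practice the occlusion masks would have to be estimated from the images, and the composite would then be passed to an OCR tool.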
