Shape analysis for the automated identification of plants from images of leaves.

Species identification is a necessary component of most studies of biological diversity, and computational approaches are beginning to automate it. In particular, plant leaves carry taxon-specific information that has been applied successfully to plant identification. Prior studies have not investigated how many leaves, or what resolution of the digitized leaf image, are required to represent a species' shape. Moreover, the relationship between accuracy and the size of the leaf shape database, and methods to integrate automated approaches with more traditional dichotomous keys, have yet to be explored. Here, I use a database of 2,420 leaves from 151 species to address these issues. Using distance metrics derived from Fourier and Procrustes analyses, I find that a minimum of 10 leaves per species, 100 margin points, and 10 Fourier harmonics are required to represent the leaf shape of a species accurately. With these settings, species identification from images of leaves is 72% accurate across all 151 species. The tight relationship between database size and accuracy is then used, in conjunction with results from probability theory, to predict the accuracy of species identification when dichotomous multiple-entry keys are used together with combined Fourier and Procrustes analysis. Combining these two approaches can greatly improve identification accuracy. Open-source software is available to implement the automated distance-based approach.
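The distance-based approach summarized above can be illustrated with a minimal sketch. It is not the paper's software: the function names (`resample_outline`, `fourier_descriptor`, `procrustes_distance`, `identify`), the amplitude-based Fourier descriptor, and the nearest-neighbor rule are all assumptions chosen for illustration. The defaults of 100 margin points and 10 harmonics follow the minimums reported in the abstract. The Procrustes function additionally assumes that both outlines start at corresponding points on the margin.

```python
import numpy as np

def resample_outline(points, n_points=100):
    """Resample a closed outline to n_points spaced evenly by arc length."""
    pts = np.asarray(points, dtype=float)
    closed = np.vstack([pts, pts[:1]])            # repeat first point to close the loop
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])
    targets = np.linspace(0.0, cum[-1], n_points, endpoint=False)
    x = np.interp(targets, cum, closed[:, 0])
    y = np.interp(targets, cum, closed[:, 1])
    return np.column_stack([x, y])

def fourier_descriptor(points, n_points=100, n_harmonics=10):
    """Harmonic amplitudes of the outline, invariant to translation,
    rotation, scale, and starting point (hypothetical descriptor)."""
    xy = resample_outline(points, n_points)
    z = xy[:, 0] + 1j * xy[:, 1]                  # outline as a complex signal
    coeffs = np.fft.fft(z) / n_points
    # coeffs[0] encodes position only, so it is dropped; taking magnitudes
    # discards rotation and the choice of starting point.
    amps = np.concatenate([np.abs(coeffs[1:n_harmonics + 1]),
                           np.abs(coeffs[-n_harmonics:])])
    return amps / np.linalg.norm(amps)            # remove overall size

def procrustes_distance(a, b, n_points=100):
    """Full Procrustes distance between two outlines after removing
    translation, scale, and rotation. Assumes corresponding start points."""
    def preshape(points):
        xy = resample_outline(points, n_points)
        z = xy[:, 0] + 1j * xy[:, 1]
        z = z - z.mean()                          # center
        return z / np.linalg.norm(z)              # unit size
    inner = np.abs(np.vdot(preshape(a), preshape(b)))
    return float(np.sqrt(max(0.0, 1.0 - inner ** 2)))

def identify(query, database, **kwargs):
    """Return the species label of the database outline whose Fourier
    descriptor is nearest to the query. database is a list of
    (species, outline) pairs."""
    q = fourier_descriptor(query, **kwargs)
    best = min(database,
               key=lambda item: np.linalg.norm(q - fourier_descriptor(item[1], **kwargs)))
    return best[0]
```

In use, `database` would hold several digitized outlines per species (the abstract suggests at least 10), and `identify(outline, database)` would return the label of the nearest stored leaf. How the Fourier and Procrustes distances are combined into a single score is not specified in the abstract, so the sketch keeps them separate.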
