Disentangling diatom species complexes: does morphometry suffice?

Accurate taxonomic resolution in light microscopy analyses of microalgae is essential to achieve high-quality, comparable results in both floristic analyses and biomonitoring studies. A number of closely related diatom taxa have been found co-occurring within benthic diatom assemblages, sharing many morphological, morphometric and ecological characteristics. In this contribution, we tested the hypothesis that, where a large sample size (number of individuals) is available, common morphometric parameters (valve length, width and stria density) are sufficient for correct identification to the species level. We focused on some common diatom taxa belonging to the genus Gomphonema. More than 400 valves and frustules were photographed in valve view and measured using Fiji software. Several statistical tools (mixture and discriminant analysis, k-means clustering, classification trees, etc.) were explored to test whether morphometry alone, independently of other valve features, yields identifications that agree with those made by experts. In view of the results obtained, we discourage morphometry-based determination in diatom taxonomy.
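The kind of comparison described above, clustering valves on length, width and stria density and then checking the clusters against expert identifications, can be sketched as follows. This is a minimal illustration and not the authors' analysis: the two taxa, their means and spreads, the sample sizes, and the helper names (`standardize`, `kmeans`, `simulate`) are all hypothetical, the measurements are simulated rather than real Gomphonema data, and only k-means (one of the several tools mentioned) is shown.

```python
import random
import statistics

def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, max_iter=100, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(row, centers[j]))
                      for row in rows]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [statistics.fmean(c) for c in zip(*members)]
    return labels

# Simulated valves for two HYPOTHETICAL taxa with overlapping ranges:
# (length in micrometres, width in micrometres, striae per 10 micrometres).
rng = random.Random(42)
def simulate(n, length, width, striae):
    return [[rng.gauss(length, 8.0), rng.gauss(width, 1.0), rng.gauss(striae, 1.5)]
            for _ in range(n)]

valves = simulate(200, 40.0, 7.0, 12.0) + simulate(200, 32.0, 6.5, 13.0)
expert = [0] * 200 + [1] * 200  # stand-in for expert identifications

labels = kmeans(standardize(valves), k=2)

# Cluster numbering is arbitrary, so score both possible label matchings.
same = sum(a == b for a, b in zip(labels, expert)) / len(expert)
agreement = max(same, 1.0 - same)
print(f"agreement with expert labels: {agreement:.2f}")
```

When the simulated ranges overlap, as they do for many co-occurring taxa, the agreement stays well below 1 regardless of sample size. That is the situation the hypothesis is meant to probe.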
