Application of a self-organizing map to select representative species in multivariate analysis: A case study determining diatom distribution patterns across France

Ecological communities consist of a large number of species. Most species are rare or have low abundance, and only a few are abundant and/or frequent. In quantitative community analysis, abundant species are commonly used to interpret patterns of habitat disturbance or ecosystem degradation. Rare species complicate quantitative analysis by introducing noise and inflating datasets, and large datasets are in turn more difficult to handle. In this study we propose a method to reduce the size of large datasets by selecting the most ecologically representative species using a self-organizing map (SOM) and a structuring index (SI). As an example, we used diatom community data comprising 941 species sampled at 836 sites throughout the French hydrosystem. Out of the 941 species, 353 were selected. The reduced dataset was effectively classified by the SOM according to similarities in community assemblages. Compared with the SOM generated from the original dataset, the resulting community pattern gave a very similar representation of the ecological conditions of the sampling sites, displaying clear gradients of environmental factors between clusters. Our results show that this computational technique can be used to preprocess data for multivariate analysis. It could be useful for ecosystem assessment and management, helping to reduce both the list of species to be identified and the size of the datasets to be processed when diagnosing the ecological status of watercourses.
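The selection step can be illustrated with a short sketch. The abstract does not define the structuring index, so the formula below is an assumption: each species is scored by summing, over all pairs of SOM units, the absolute difference of its weights divided by the distance between the two units on the map. The function names (`structuring_index`, `select_top`), the toy weights, and the use of a fixed count rather than a threshold are all hypothetical, and the SOM is assumed to be already trained.

```python
from itertools import combinations
from math import dist


def structuring_index(weights, positions):
    """Return one score per species (assumed formula, not taken from the paper).

    weights[j][i] : weight of species i in SOM unit j (map assumed already trained)
    positions[j]  : (row, col) grid coordinates of unit j
    """
    n_species = len(weights[0])
    scores = [0.0] * n_species
    for j, k in combinations(range(len(weights)), 2):
        d = dist(positions[j], positions[k])  # map distance between the two units
        for i in range(n_species):
            scores[i] += abs(weights[j][i] - weights[k][i]) / d
    return scores


def select_top(scores, n_keep):
    """Indices of the n_keep highest-scoring species, returned in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Toy 2x2 map with three hypothetical species.
positions = [(0, 0), (0, 1), (1, 0), (1, 1)]
weights = [
    [0.9, 0.5, 0.0],  # species 0 shows a strong gradient across the map
    [0.8, 0.5, 0.0],  # species 1 is constant everywhere
    [0.1, 0.5, 0.0],  # species 2 is nearly absent
    [0.0, 0.5, 0.1],
]
scores = structuring_index(weights, positions)
selected = select_top(scores, 1)
```

In this toy case the constant species scores zero and the species with the strongest gradient is the one retained, which matches the intuition that a representative species is one whose weights vary in a structured way across the map. How many species to keep is a separate choice that this sketch leaves open.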
