Are data-mining techniques useful for selecting ecological indicators in biodiverse regions? Bridges between market basket analysis and indicator value analysis from a case study in the neotropics

Abstract: Ecological monitoring research relies heavily on signals to detect ecosystem changes, which makes the selection of indicators a crucial methodological requirement. Over the years, individual species and species assemblages have been widely used as indicators, giving rise to reference methods that support their detection. One such method, Indicator Value Analysis (IndVal), has been adapted to identify not only single species but also combinations of species, on the assumption that they respond collectively to environmental factors. However, IndVal requires species to be pre-selected before the analysis, especially for large datasets (e.g. high species richness), on which it becomes ineffective. Species pre-selection can introduce subjectivity and bias into the database, which may in turn affect the final set of indicators. To address these issues, the authors propose the use of Market Basket Analysis (MBA), a data-mining method that is mathematically similar to IndVal but designed to handle large amounts of data. Both methods were applied to select indicators from progressively larger datasets of soil-surface-dwelling arthropods from the Brazilian Amazon, and threshold-dependent indices were used to assess the concordance between their results. In general, the two methods produced similar results, with an average Jaccard's distance of 0.432 (±0.346) and an average True Skill Statistic of 0.991 (±0.012). As expected, MBA was able to select ecological indicators without species pre-selection, including from datasets on which IndVal had been unsuccessful. In such cases, the authors demonstrate that MBA, by means of objective association rules, could be used to pre-select ecological indicators, which can then be further processed and summarized with the IndVal method.
In this study, the authors briefly outline the potential of MBA to complement IndVal and discuss the advantages and disadvantages of using MBA for the (pre-)selection of ecological indicators.
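
The abstract does not give formulas, so the following Python sketch only illustrates why the two methods can be described as mathematically similar and how the two concordance indices are computed. It assumes the presence/absence form of IndVal (specificity A, fidelity B, and their geometric mean) and the standard definitions of support and confidence for an association rule of the form {species} -> {group}. The toy data, the function names, and the choice to treat one method's selection as the reference set for the True Skill Statistic are all hypothetical and are not taken from the study.

```python
from math import sqrt

# Hypothetical toy data: each site is (habitat group, set of species present).
SITES = [
    ("forest", {"sp1", "sp2"}),
    ("forest", {"sp1"}),
    ("forest", {"sp1", "sp3"}),
    ("pasture", {"sp2", "sp3"}),
    ("pasture", {"sp3"}),
    ("pasture", {"sp1", "sp3"}),
]


def counts(sites, species, group):
    """Return (sites in group, sites with species, sites with both)."""
    n_group = sum(1 for g, _ in sites if g == group)
    n_species = sum(1 for _, present in sites if species in present)
    n_both = sum(1 for g, present in sites if g == group and species in present)
    return n_group, n_species, n_both


def indval(sites, species, group):
    """Presence/absence IndVal components: A (specificity), B (fidelity), sqrt(A*B)."""
    n_group, n_species, n_both = counts(sites, species, group)
    a = n_both / n_species if n_species else 0.0
    b = n_both / n_group if n_group else 0.0
    return a, b, sqrt(a * b)


def rule_metrics(sites, species, group):
    """Support and confidence of the association rule {species} -> {group}."""
    _, n_species, n_both = counts(sites, species, group)
    support = n_both / len(sites) if sites else 0.0
    confidence = n_both / n_species if n_species else 0.0
    return support, confidence


def jaccard_distance(x, y):
    """1 - |x & y| / |x | y| between two sets of selected indicators."""
    x, y = set(x), set(y)
    union = x | y
    return 1.0 - len(x & y) / len(union) if union else 0.0


def true_skill_statistic(reference, predicted, candidates):
    """Sensitivity + specificity - 1, with `reference` treated as the truth."""
    reference, predicted, candidates = set(reference), set(predicted), set(candidates)
    tp = len(reference & predicted)
    fn = len(reference - predicted)
    fp = len(predicted - reference)
    tn = len(candidates - reference - predicted)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity + specificity - 1.0


if __name__ == "__main__":
    print(indval(SITES, "sp1", "forest"))
    print(rule_metrics(SITES, "sp1", "forest"))
    print(jaccard_distance({"sp1", "sp2"}, {"sp1", "sp3"}))
    print(true_skill_statistic({"sp1", "sp2"}, {"sp1", "sp3"},
                               {"sp1", "sp2", "sp3", "sp4"}))
```

Under these assumptions, the confidence of the rule {species} -> {group} equals the IndVal specificity component A, and the confidence of the reverse rule {group} -> {species} would equal the fidelity component B. This is one way to read the similarity mentioned in the abstract; the exact mapping used in the study may differ.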
