A methodology for the fast identification and monitoring of microplastics in environmental samples using random decision forest classifiers

Microplastics are a new and still poorly understood threat to our ecosystems. These microscopic particles accumulate in our oceans and may ultimately find their way into the food chain. Even though their origin and the laws governing their formation have become increasingly clear, fast and reliable methodologies for their analysis and identification are still lacking or at an early stage of development. The first automatic approaches to analyzing μFTIR images of microplastics enriched on membrane filters are promising and provide the impetus for further development. In this paper, we present a methodology that allows discrimination between different polymer types and measurement of their abundance and size distributions with high accuracy. In particular, we apply random decision forest classifiers and compute a multiclass model for the polymers polyethylene, polypropylene, poly(methyl methacrylate), polyacrylonitrile, and polystyrene. Further classification results of the analyzed μFTIR images are given for comparability. The study also briefly discusses common issues that can arise in classification, such as the curse of dimensionality and label noise.
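
As a rough illustration of the kind of multiclass random forest model described above, the following minimal sketch trains a classifier on synthetic spectra and counts the predicted abundance per polymer class. It is not the authors' implementation: the use of scikit-learn, the helper `make_synthetic_spectra`, the peak positions, the noise level, and all hyperparameters are assumptions made purely for illustration, and the synthetic peaks do not correspond to real absorption bands.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Class labels taken from the abstract; everything else below is hypothetical.
POLYMERS = ["PE", "PP", "PMMA", "PAN", "PS"]


def make_synthetic_spectra(n_per_class=60, n_bands=200, seed=0):
    """Generate toy 'spectra': one Gaussian peak per class plus noise.

    The peak positions are arbitrary and do NOT represent real FTIR bands.
    """
    rng = np.random.default_rng(seed)
    bands = np.arange(n_bands)
    X, y = [], []
    for label in range(len(POLYMERS)):
        center = 20 + 35 * label  # arbitrary, class-specific peak position
        peak = np.exp(-0.5 * ((bands - center) / 4.0) ** 2)
        noise = rng.normal(0.0, 0.1, size=(n_per_class, n_bands))
        X.append(peak + noise)  # broadcast the peak over all noisy samples
        y.extend([label] * n_per_class)
    return np.vstack(X), np.array(y)


X, y = make_synthetic_spectra()

# Hold out 30 % of the spectra, stratified so each class is represented.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# One forest handles all five classes directly (multiclass model).
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
pred = clf.predict(X_test)

# Abundance: number of test spectra assigned to each polymer class.
abundance = {POLYMERS[k]: int(np.sum(pred == k)) for k in range(len(POLYMERS))}

print(f"held-out accuracy: {accuracy:.3f}")
print("predicted abundance:", abundance)
```

In a real μFTIR image, each pixel spectrum would take the place of a synthetic row in `X`, and particle sizes would have to be derived in a separate step from the spatial layout of the classified pixels, which this sketch does not attempt.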
