The contribution of cluster and discriminant analysis to the classification of complex aquifer systems

This paper presents an innovative method for assigning groundwater samples to groups that represent the hydrogeological units from which they were pumped. The method proved very efficient even in areas with complex hydrogeological regimes. It requires chemical analyses of the water samples for major ions only, which makes it applicable in most cases worldwide. A further benefit is that it offers additional insight into the hydrogeochemistry of the aquifer, because it identifies the ions responsible for discriminating the groups. The procedure begins with a cluster analysis of the dataset, which assigns each sample to its corresponding hydrogeological unit. The feasibility of the method is demonstrated by the fact that the samples of volcanic origin were separated into two different clusters, namely the lava units and the pyroclastic–ignimbritic aquifer. The second step is a discriminant analysis of the data, which provides the functions that distinguish the groups from each other and the most significant variables defining the hydrochemical composition of the aquifer. The whole procedure was highly successful, as 94.7 % of the samples were classified to the correct aquifer system. Finally, the resulting functions can be safely used to categorize samples of unknown or doubtful origin, thereby improving the quality and size of existing hydrochemical databases.
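
The two-step procedure described above (cluster analysis followed by discriminant analysis) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which linkage rule or discriminant variant was used, so average-linkage clustering and a two-group Fisher linear discriminant are stand-ins. The sample values, the two unnamed variables, and the function names `average_linkage`, `fisher_discriminant` and `classify` are all hypothetical.

```python
from itertools import combinations
from math import dist

# Made-up, already scaled values of two hypothetical major-ion variables.
# The first four samples mimic one unit, the last four another.
samples = [(1.0, 0.2), (1.2, 0.4), (0.9, 0.3), (1.1, 0.1),
           (3.0, 2.1), (3.2, 2.4), (2.9, 2.0), (3.1, 2.3)]

def average_linkage(points, k):
    """Step 1: merge the two closest clusters until only k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        def gap(pair):
            # Mean pairwise distance between the members of two clusters.
            a, b = pair
            total = sum(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
            return total / (len(clusters[a]) * len(clusters[b]))
        a, b = min(combinations(range(len(clusters)), 2), key=gap)
        clusters[a] += clusters[b]   # a < b, so deleting b keeps a valid
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def fisher_discriminant(points, labels):
    """Step 2: weights and cutoff of a two-group linear discriminant."""
    groups = [[p for p, g in zip(points, labels) if g == c] for c in (0, 1)]
    means = [[sum(col) / len(g) for col in zip(*g)] for g in groups]
    s = [[0.0, 0.0], [0.0, 0.0]]     # pooled within-group scatter (2x2)
    for g, m in zip(groups, means):
        for p in g:
            d = (p[0] - m[0], p[1] - m[1])
            for r in range(2):
                for c in range(2):
                    s[r][c] += d[r] * d[c]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = (means[1][0] - means[0][0], means[1][1] - means[0][1])
    # w = S^-1 (m1 - m0), written out for the 2x2 case.
    w = ((s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (s[0][0] * diff[1] - s[1][0] * diff[0]) / det)
    # Cutoff: discriminant score of the midpoint between the group means.
    cutoff = sum(w[i] * (means[0][i] + means[1][i]) / 2 for i in range(2))
    return w, cutoff

def classify(point, w, cutoff):
    """Assign a sample to group 0 or 1 from its discriminant score."""
    return int(w[0] * point[0] + w[1] * point[1] > cutoff)

labels = average_linkage(samples, k=2)
w, cutoff = fisher_discriminant(samples, labels)
predicted = [classify(p, w, cutoff) for p in samples]
accuracy = sum(p == g for p, g in zip(predicted, labels)) / len(samples)
print("cluster labels:", labels)
print("discriminant weights:", w)
print("share of samples reclassified to their cluster:", accuracy)
```

Once fitted, `classify` plays the role of the resulting functions mentioned in the abstract: it can be applied to a sample of unknown or doubtful origin. When the variables are on a comparable scale, the relative size of the weights in `w` gives a rough indication of which variable contributes most to separating the groups.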
