Application of classification methods to analyze chemicals in drinking water quality

Various statistical methods, including discriminant analysis, logistic regression and cluster analysis, have been applied to drinking water datasets to construct models that identify important input variables. Decision trees are more flexible than these other classification methods because they provide a complete path to a specific decision and make the critical variables easy to understand. This article describes the application of classification decision trees to the variables affecting drinking water quality, discusses the results obtained with several tree-building methods, and compares those methods to identify the best approach for further analysis of the study area. Samples of filtered water were taken from 100 pumps located in different union councils of the city of Lahore. The classification trees were constructed from the input quality variables, and the results are reported as confusion matrices. Four techniques were used: Chi-square Automatic Interaction Detector (CHAID), Exhaustive CHAID, Classification and Regression Tree (CART) and Quick Unbiased Efficient Statistical Tree (QUEST). Three experiments were conducted to evaluate the performance of the models by the number of misclassified units. The first used the complete dataset, the second was based on cross-validation, and the third on random subsampling.
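The three evaluation schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. A single-split tree chosen by Gini impurity stands in for a CART-style tree, and CHAID, Exhaustive CHAID and QUEST are not implemented. The data are synthetic, and the two features, the class labels, the 10 folds and the 30% hold-out fraction are all assumptions made for illustration only.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y):
    """One-split tree: choose (feature, threshold) minimising weighted Gini."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, majority(left), majority(right))
    if best is None:  # no valid split: always predict the majority class
        m = majority(y)
        return (0, float("inf"), m, m)
    return best[1:]

def predict(stump, row):
    j, t, left_label, right_label = stump
    return left_label if row[j] <= t else right_label

def evaluate(X, y, train_idx, test_idx, classes):
    """Train on train_idx, return the confusion matrix on test_idx
    (rows = actual class, columns = predicted class)."""
    stump = fit_stump([X[i] for i in train_idx], [y[i] for i in train_idx])
    pos = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for i in test_idx:
        matrix[pos[y[i]]][pos[predict(stump, X[i])]] += 1
    return matrix

def misclassified(matrix):
    """Number of misclassified units = off-diagonal total."""
    total = sum(sum(row) for row in matrix)
    return total - sum(matrix[k][k] for k in range(len(matrix)))

def complete_dataset(X, y, classes):
    idx = list(range(len(y)))
    return evaluate(X, y, idx, idx, classes)  # train and test on all units

def cross_validation(X, y, classes, k=10, seed=0):
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    total = [[0] * len(classes) for _ in classes]
    for f in range(k):
        test = idx[f::k]                      # every k-th unit forms one fold
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        fold = evaluate(X, y, train, test, classes)
        total = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(total, fold)]
    return total

def random_subsampling(X, y, classes, test_fraction=0.3, seed=0):
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    return evaluate(X, y, idx[n_test:], idx[:n_test], classes)

# Synthetic stand-in for 100 pumps; the features and labels are hypothetical.
rng = random.Random(42)
X = [[rng.uniform(6.0, 9.0), rng.uniform(0.0, 10.0)] for _ in range(100)]
y = ["unsafe" if row[1] > 5.0 else "safe" for row in X]
y = [("safe" if lab == "unsafe" else "unsafe") if rng.random() < 0.1 else lab
     for lab in y]                            # flip about 10% of labels as noise
classes = ["safe", "unsafe"]

full = complete_dataset(X, y, classes)
cv = cross_validation(X, y, classes)
sub = random_subsampling(X, y, classes)
for name, m in [("complete dataset", full), ("cross-validation", cv),
                ("random subsampling", sub)]:
    print(name, m, "misclassified:", misclassified(m))
```

Training and testing on the complete dataset tends to understate the error, because the tree is scored on the units it was fitted to. Cross-validation and random subsampling score it on held-out units instead, which is why comparing all three misclassification counts is informative.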
