Source Apportionment of Water Pollution in the Jinjiang River (China) Using Factor Analysis With Nonnegative Constraints and Support Vector Machines

Source apportionment studies of water pollution can greatly improve understanding of human impacts on the aquatic environment. Factor analysis (FA) has been widely used to identify sources of water pollution because it is relatively easy to implement. Sources are usually identified by qualitatively comparing source emission characteristics with the factor loadings derived from FA. However, this traditional approach is rather coarse for expressing the nonlinear relationship between source emission characteristics and factor loadings. This study treats source identification from source emission characteristics and factor loadings as a pattern recognition problem and proposes a source apportionment method that combines factor analysis with nonnegative constraints (FA-NNC) and the support vector machine (SVM). Water quality data sets from the Jinjiang River (China), sampled at 13 sites between May 2009 and September 2010, were used to evaluate the proposed method. The sources identified by the combined models were similar to those obtained from a comprehensive analysis based on qualitatively comparing source emission characteristics with factor loadings. Industrial activities, including papermaking and textiles, metal handicraft manufacturing, chemical and metal production, metal refining, and iron ore mining, were identified as the main pollution sources, with a contribution ratio of 79.58%, followed by agricultural non-point sources (20.42%). These results can help policy and decision makers manage water pollution in the Jinjiang River. The study also offers a useful direction for developing source apportionment approaches that support water pollution management.
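The two-stage idea described above (extract nonnegative factors, then classify their loadings against known source emission characteristics) can be illustrated with a minimal sketch. It is not the authors' implementation: scikit-learn's `NMF` is swapped in as a stand-in for FA-NNC, the profiles in `PROFILES`, the six variables, the synthetic measurements, and the definition of contribution ratio used in `apportion` are all assumptions made for illustration, and no value here comes from the Jinjiang River data.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical source emission profiles over six water quality variables
# (first three: metal-like indicators, last three: nutrient-like indicators).
PROFILES = {
    "industrial": np.array([0.5, 0.3, 0.2, 0.0, 0.0, 0.0]),
    "agricultural": np.array([0.0, 0.0, 0.0, 0.2, 0.3, 0.5]),
}


def noisy_profiles(base, n):
    """Return n perturbed, nonnegative copies of a profile, each summing to 1."""
    p = np.abs(base + rng.normal(0.0, 0.03, size=(n, base.size)))
    return p / p.sum(axis=1, keepdims=True)


# Stage 0: train an SVM to recognise a source type from a normalised profile.
train_X = np.vstack([noisy_profiles(p, 20) for p in PROFILES.values()])
train_y = np.repeat(list(PROFILES.keys()), 20)
clf = SVC(kernel="rbf", gamma="scale", C=1.0).fit(train_X, train_y)

# Synthetic measurements (assumed): 60 samples, each a nonnegative mixture
# of the two hypothetical sources.
G_true = rng.uniform(0.0, 1.0, size=(60, 2))
F_true = np.vstack(list(PROFILES.values()))
X = G_true @ F_true


def apportion(X, clf, n_sources):
    """Factorise X under nonnegativity, label each factor, and compute shares."""
    # Stage 1: nonnegative factorisation X ~ W @ H (stand-in for FA-NNC).
    model = NMF(n_components=n_sources, init="nndsvda",
                random_state=0, max_iter=2000)
    W = model.fit_transform(X)   # sample-by-factor contributions
    H = model.components_        # factor-by-variable loadings
    # Stage 2: normalise each loading row and let the SVM name the source.
    loadings = H / H.sum(axis=1, keepdims=True)
    labels = clf.predict(loadings)
    # Assumed contribution ratio: each factor's share of the reconstructed total.
    mass = W.sum(axis=0) * H.sum(axis=1)
    ratios = mass / mass.sum()
    return labels, ratios


labels, ratios = apportion(X, clf, n_sources=2)
for name, share in zip(labels, ratios):
    print(f"{name}: {share:.1%}")
```

Treating the loading rows as inputs to a trained classifier replaces the qualitative, by-eye comparison with a decision function that can be nonlinear in the profile, which is the motivation the abstract gives for adding the SVM.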
