PREDICTIVE DATA MINING TECHNIQUES FOR MANAGEMENT OF HIGH DIMENSIONAL BIG-DATA

Data mining is a technique in which historical data is explored in search of systematic relationships between variables or consistent patterns. The identified patterns are then validated by applying them to new subsets of the data. This paper compares three predictive data-mining techniques, namely multiple linear regression, principal component regression, and partial least squares, on a single dataset. The dataset is unusual in that its predictor variables combine outliers, high collinearity, and heavy redundancy. In the first step, after preprocessing the data, a minimal number of variables is selected that can fully predict the response variable. Each data-mining technique can be applied in several ways; all were implemented on the complete dataset, the best variant of each technique was identified, and these best variants were then compared globally with one another on the same data.
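The kind of comparison described above can be illustrated with a minimal sketch. It assumes NumPy and uses small synthetic data with collinear predictors; the data, the helper names (`make_data`, `fit_mlr`, `fit_pcr`, `fit_pls1`, `rmse`), and the choice of two components are illustrative assumptions, not the dataset or settings used in the paper. Outlier handling and variable selection are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic data (assumption): 6 collinear predictors driven by 2 latent factors."""
    Z = rng.normal(size=(n, 2))
    A = np.array([[1.0, 0.9, 0.8, 0.0, 0.1, 0.2],
                  [0.0, 0.1, 0.2, 1.0, 0.9, 0.8]])
    X = Z @ A + 0.05 * rng.normal(size=(n, 6))
    y = 3.0 * Z[:, 0] - 2.0 * Z[:, 1] + 0.1 * rng.normal(size=n)
    return X, y

def fit_mlr(X, y):
    """Multiple linear regression: least squares on centered data."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def fit_pcr(X, y, n_components):
    """Principal component regression: regress y on the leading PC scores."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)  # rows of Vt: principal directions
    V = Vt[:n_components].T
    scores = X @ V
    gamma, *_ = np.linalg.lstsq(scores, y, rcond=None)
    return V @ gamma  # map back to coefficients on the original predictors

def fit_pls1(X, y, n_components):
    """Partial least squares (single response) via NIPALS with deflation."""
    Xk, yk = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)           # weight vector
        t = Xk @ w                       # score vector
        tt = t @ t
        p = Xk.T @ t / tt                # X loading
        c = yk @ t / tt                  # y loading
        Xk = Xk - np.outer(t, p)         # deflate X
        yk = yk - c * t                  # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

X_train, y_train = make_data(60)
X_test, y_test = make_data(200)

# Center with training statistics only, then reuse them for prediction.
x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
Xc, yc = X_train - x_mean, y_train - y_mean

models = {
    "MLR": fit_mlr(Xc, yc),
    "PCR": fit_pcr(Xc, yc, n_components=2),
    "PLS": fit_pls1(Xc, yc, n_components=2),
}
results = {name: rmse(y_test, (X_test - x_mean) @ coef + y_mean)
           for name, coef in models.items()}

for name, err in results.items():
    print(f"{name}: test RMSE = {err:.3f}")
```

In a fuller study the number of components for PCR and PLS would be chosen by cross-validation rather than fixed in advance, and each technique would be tried in several variants before the best of each is compared.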
