Review of soft sensor methods for regression applications

Abstract

Soft sensors for regression applications (SSR) are inferential models that use sensors available online (e.g. temperature, pressure, flow rate) to predict quality variables that cannot be measured automatically at all, or that can only be measured at high cost, sporadically, or with long delays (e.g. by laboratory analysis). SSR are built from historical process data, usually provided by the supervisory control and data acquisition (SCADA) system or obtained from laboratory annotations/measurements. SSR development raises several issues, the main ones being the treatment of missing data, outlier detection, selection of input variables, model training, validation, and SSR maintenance. This work reviews the literature on each of these topics, covering the most important works in each area. Emphasis is given to the methods rather than to the applications.
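The development steps named in the abstract can be illustrated with a minimal sketch. It covers only missing-data treatment, outlier detection, model training, and validation. Everything in it is an assumption made for illustration rather than a method taken from the reviewed works: the function names, the synthetic temperature/quality data, median imputation, the Hampel-style outlier rule with threshold k = 3, and the univariate least-squares model.

```python
import random
import statistics


def impute_median(values):
    """Replace missing readings (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def hampel_mask(values, k=3.0):
    """Return True for points within k scaled MADs of the median (inliers)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    if scale == 0:
        return [True] * len(values)
    return [abs(v - med) <= k * scale for v in values]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rmse(x, y, slope, intercept):
    """Root mean squared prediction error on the given samples."""
    sq = [(slope * xi + intercept - yi) ** 2 for xi, yi in zip(x, y)]
    return (sum(sq) / len(sq)) ** 0.5


def build_soft_sensor(inputs, targets, holdout=0.25):
    """Clean the input, train on the earliest samples, validate on the rest."""
    x = impute_median(inputs)                      # missing-data treatment
    keep = hampel_mask(x)                          # outlier detection
    x_clean = [xi for xi, ok in zip(x, keep) if ok]
    y_clean = [yi for yi, ok in zip(targets, keep) if ok]
    split = int(len(x_clean) * (1 - holdout))      # chronological split
    slope, intercept = fit_line(x_clean[:split], y_clean[:split])
    error = rmse(x_clean[split:], y_clean[split:], slope, intercept)
    return slope, intercept, error


def demo():
    """Synthetic example: quality = 0.5 * temperature + 2 plus small noise."""
    rng = random.Random(0)
    temperature = [rng.uniform(50.0, 100.0) for _ in range(200)]
    quality = [0.5 * t + 2.0 + rng.gauss(0.0, 0.2) for t in temperature]
    temperature[10] = None    # simulated sensor dropouts
    temperature[50] = None
    temperature[120] = 500.0  # simulated sensor spike
    return build_soft_sensor(temperature, quality)


if __name__ == "__main__":
    slope, intercept, error = demo()
    print(f"slope={slope:.3f} intercept={intercept:.3f} holdout RMSE={error:.3f}")
```

The split is chronological rather than random because the model is meant to predict future samples from past ones. Input variable selection and maintenance (adaptation as the process drifts) are deliberately left out of this sketch.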
