Co-training partial least squares model for semi-supervised soft sensor development

Abstract: In soft sensor modeling, easy-to-measure variables are typically used to predict hard-to-measure ones. In practice, however, the easy-to-measure variables are redundant, while the hard-to-measure ones are scarce because they are often obtained from offline lab analyses. This paper introduces semi-supervised learning for soft sensor modeling. In particular, the co-training strategy is combined with the conventionally used partial least squares (PLS) model, and a co-training style algorithm called co-training PLS is proposed for developing a semi-supervised soft sensor. By splitting the process variables into two parts, two diverse PLS regression models can be developed. Through an iterative learning procedure, the final newly labeled data sets are determined, and two new regressors are constructed from them for soft sensing. Two examples are provided to evaluate the proposed method, with detailed comparisons against soft sensors based on traditional PLS and the co-training kNN model.
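The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how pseudo-labeled samples are selected or how the two final regressors are combined, so the sketch assumes a COREG-style rule (a sample is accepted only if adding it lowers the regressor's error on the labeled data) and assumes the two predictions are averaged. The names `pls1_fit`, `pls1_predict`, `co_training_pls`, and `co_predict`, the NIPALS-based single-output PLS, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_comp=2):
    """Single-output PLS via NIPALS; returns (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:                    # no covariance left to extract
            break
        w = w / norm
        t = Xk @ w                          # score vector
        tt = t @ t
        p = Xk.T @ t / tt                   # X loading
        c = (yk @ t) / tt                   # y loading
        Xk, yk = Xk - np.outer(t, p), yk - c * t   # deflation
        W.append(w); P.append(p); q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return x_mean, y_mean, W @ np.linalg.solve(P.T @ W, q)

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

def _mse(model, X, y):
    return float(np.mean((pls1_predict(model, X) - y) ** 2))

def co_training_pls(XL, yL, XU, views, n_comp=2, max_iter=10, tol=1e-12):
    """views: two lists of column indices, one per regressor (assumed split)."""
    train = [(XL[:, cols], yL.copy()) for cols in views]
    pool = list(range(len(XU)))
    for _ in range(max_iter):
        picks = []
        for v, cols in enumerate(views):
            Xv, yv = train[v]
            model = pls1_fit(Xv, yv, n_comp)
            base = _mse(model, XL[:, cols], yL)
            best_gain, best_i, best_y = tol, None, None
            for i in pool:
                x = XU[i:i + 1, cols]
                y_hat = pls1_predict(model, x)
                trial = pls1_fit(np.vstack([Xv, x]), np.append(yv, y_hat), n_comp)
                gain = base - _mse(trial, XL[:, cols], yL)  # assumed confidence rule
                if gain > best_gain:
                    best_gain, best_i, best_y = gain, i, float(y_hat[0])
            if best_i is not None:
                picks.append((1 - v, best_i, best_y))  # teach the other regressor
                pool.remove(best_i)
        if not picks:                       # neither regressor found a useful sample
            break
        for target, i, y_hat in picks:
            Xt, yt = train[target]
            x_new = XU[i:i + 1, views[target]]
            train[target] = (np.vstack([Xt, x_new]), np.append(yt, y_hat))
    models = [pls1_fit(X, y, n_comp) for X, y in train]
    return models, train

def co_predict(models, views, X):
    # Averaging the two regressors is an assumption, not taken from the abstract.
    return np.mean([pls1_predict(m, X[:, c]) for m, c in zip(models, views)], axis=0)

# Synthetic demonstration: few labeled samples, many unlabeled ones.
rng = np.random.default_rng(0)
X_all = rng.normal(size=(60, 6))
y_all = X_all @ np.array([1.0, 0.5, -1.0, 2.0, 0.0, 0.3]) + 0.1 * rng.normal(size=60)
views = [[0, 1, 2], [3, 4, 5]]
models, train = co_training_pls(X_all[:12], y_all[:12], X_all[12:42], views)
print("final training set sizes:", [len(y) for _, y in train])
print("test MSE:", float(np.mean((co_predict(models, views, X_all[42:]) - y_all[42:]) ** 2)))
```

Whether any unlabeled samples are accepted depends on the data and on the number of latent components. With a full-rank PLS model, a self-labeled point lies on the fitted hyperplane and leaves the fit unchanged, so the assumed rule only selects samples when fewer components than variables are used.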
