Novel manifold learning based virtual sample generation for optimizing soft sensor with small data.

Because industrial processes have extremely complex mechanisms and strongly nonlinear characteristics, data-driven soft sensor technologies play a key role in intelligent measurement for the process industries. However, the process data collected in the steady stage carry limited and unreliable information, which gives rise to the small sample problem. As a result, capturing the nature of the process and building accurate soft sensor models becomes an intractable challenge. To solve this problem, this paper proposes a novel manifold learning based virtual sample generation method (Isomap-VSG) that generates feasible virtual samples in the information gaps to supplement the original small sample space. To locate data-sparse regions reasonably, the manifold learning method Isomap is used to visualize the high-dimensional process data. Virtual samples are then generated using an interpolation method and an extreme learning machine. Simulation results on a standard dataset and a real-world application demonstrate that, compared with other advanced methods, the proposed Isomap-VSG method generates more feasible virtual samples and yields more accurate soft sensor models from limited samples.
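
The pipeline described above can be sketched roughly as follows. This is only one possible reading of the abstract, not the authors' implementation: the abstract does not say how sparse regions are scored, which interpolation is used, or what the extreme learning machine maps between. The sketch assumes that gaps are the longest nearest-neighbor edges in the Isomap embedding, that interpolation is a simple midpoint, and that the extreme learning machine maps embedded points back to the original input-output space. All function names, parameters, and the toy data are hypothetical.

```python
import numpy as np

def isomap(X, n_neighbors=5, n_components=2):
    """Minimal Isomap: kNN graph -> geodesic distances -> classical MDS."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    g = np.full((n, n), np.inf)
    np.fill_diagonal(g, 0.0)
    nbrs = np.argsort(d, axis=1)[:, 1:n_neighbors + 1]
    for i in range(n):
        g[i, nbrs[i]] = d[i, nbrs[i]]
        g[nbrs[i], i] = d[i, nbrs[i]]
    for k in range(n):  # Floyd-Warshall shortest paths
        g = np.minimum(g, g[:, [k]] + g[[k], :])
    if not np.isfinite(g).all():
        raise ValueError("neighbor graph is disconnected; increase n_neighbors")
    h = np.eye(n) - 1.0 / n  # centering matrix
    b = -0.5 * h @ (g ** 2) @ h
    w, v = np.linalg.eigh(b)
    top = np.argsort(w)[::-1][:n_components]
    return v[:, top] * np.sqrt(np.maximum(w[top], 0.0))

def gap_midpoints(Z, n_virtual, n_neighbors=5):
    """Assumed gap rule: midpoints of the longest kNN edges in the embedding."""
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    nbrs = np.argsort(d, axis=1)[:, 1:n_neighbors + 1]
    edges = {(min(i, int(j)), max(i, int(j)))
             for i in range(len(Z)) for j in nbrs[i]}
    longest = sorted(edges, key=lambda e: d[e], reverse=True)[:n_virtual]
    return np.array([(Z[i] + Z[j]) / 2.0 for i, j in longest])

class ELM:
    """Single-hidden-layer extreme learning machine with random hidden weights."""
    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def fit(self, Z, T):
        self.W = self.rng.normal(size=(Z.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(Z @ self.W + self.b)
        self.beta = np.linalg.pinv(H) @ T  # least-squares output weights
        return self

    def predict(self, Z):
        return np.tanh(Z @ self.W + self.b) @ self.beta

# Toy small-sample data (hypothetical): 30 points on a 3-D curve with one target.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 3.0, size=30))
X = np.column_stack([np.cos(t), np.sin(t), t])
y = t ** 2

Z = isomap(X, n_neighbors=5, n_components=2)   # low-dimensional view
Z_virtual = gap_midpoints(Z, n_virtual=10)     # points placed in sparse regions
elm = ELM(n_hidden=20).fit(Z, np.column_stack([X, y]))
virtual = elm.predict(Z_virtual)               # virtual samples as [x1, x2, x3, y]
```

Each row of `virtual` is a candidate virtual sample in the original input-output space. A fuller treatment would also check the feasibility of each candidate before adding it to the training set for the soft sensor model.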
