Distributed parallel deep learning of Hierarchical Extreme Learning Machine for multimode quality prediction with big process data

Abstract In this work, distributed and parallel versions of the Extreme Learning Machine (dp-ELM) and the Hierarchical Extreme Learning Machine (dp-HELM) are proposed for multimode process quality prediction with big data. The efficient ELM algorithm is first recast in a distributed and parallel modeling form under the MapReduce framework. Because the deep network structure of HELM represents features more accurately than the single hidden layer of ELM, dp-HELM is then developed by decomposing the ELM-based auto-encoders (ELM-AE) of the deep hidden layers into a loop of MapReduce jobs. The multimode issue is handled with a “divide and rule” strategy: distributed and parallel K-means (dp-K-means) partitions the data into process modes, and a local model is trained for each mode, synchronously and in parallel, by dp-ELM and dp-HELM. Finally, a Bayesian model fusion technique integrates the local models for online prediction. The proposed algorithms are deployed on a Hadoop MapReduce computing cluster, and their feasibility and efficiency are illustrated by building a real industrial quality prediction model with big process data.
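The abstract does not give the dp-ELM equations, so the following is only a minimal sketch of one common way to express ELM training in map/reduce style, not necessarily the authors' formulation. It assumes a ridge-regularized least-squares solution for the output weights, where each mapper accumulates the partial sums H^T H and H^T t over its own data block and a reducer adds them up before a single small linear solve. All function names (`random_hidden_layer`, `map_block`, `reduce_partials`, `train_elm`) and the `ridge` parameter are hypothetical, and the code runs in one Python process rather than on Hadoop.

```python
import math
import random
from functools import reduce


def random_hidden_layer(n_in, n_hidden, seed=0):
    """Draw fixed random input weights W and biases b (never trained in ELM)."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return W, b


def hidden_output(x, W, b):
    """Sigmoid hidden-layer output h(x) for one sample."""
    return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + bj)))
            for w, bj in zip(W, b)]


def map_block(block, W, b):
    """Map step (assumed form): partial sums H^T H and H^T t for one data block."""
    n = len(b)
    HtH = [[0.0] * n for _ in range(n)]
    Htt = [0.0] * n
    for x, t in block:
        h = hidden_output(x, W, b)
        for i in range(n):
            Htt[i] += h[i] * t
            for j in range(n):
                HtH[i][j] += h[i] * h[j]
    return HtH, Htt


def reduce_partials(p, q):
    """Reduce step: element-wise sum of two partial results."""
    HtH = [[u + v for u, v in zip(rp, rq)] for rp, rq in zip(p[0], q[0])]
    Htt = [u + v for u, v in zip(p[1], q[1])]
    return HtH, Htt


def solve(A, y):
    """Solve A beta = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    beta = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * beta[k] for k in range(i + 1, n))
        beta[i] = (M[i][n] - s) / M[i][i]
    return beta


def train_elm(blocks, W, b, ridge=1e-2):
    """Output weights beta = (H^T H + ridge * I)^-1 H^T t from block-wise sums."""
    partials = [map_block(blk, W, b) for blk in blocks]  # mappers, run serially here
    HtH, Htt = reduce(reduce_partials, partials)         # single reducer
    for i in range(len(b)):
        HtH[i][i] += ridge
    return solve(HtH, Htt)
```

Because the partial sums are additive, splitting the samples into blocks gives the same output weights, up to floating-point rounding, as training on the whole data set at once. The matrices exchanged between mappers and the reducer have the size of the hidden layer, not of the data set.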
