An intelligent hybrid feature subset selection and production pattern recognition method for modeling steam cracking process

Abstract A data-driven modeling framework that integrates Feature Subset Selection (FSS), production pattern clustering, and prediction is proposed for predicting the ethylene yield of an ethylene plant from the massive sensing data recorded by the Distributed Control System (DCS) of petrochemical enterprises. First, an Ensemble-Filter FSS model based on three different metrics performs an initial screening of all steam cracking furnace features, and a Wrapper FSS model based on GA-SVR then selects the optimal subset of features affecting ethylene yield. Next, the production patterns embedded in the steam cracking furnace data are identified with the Density Peak Clustering (DPC) algorithm. Ethylene yield prediction models are then developed separately for each production pattern, and their outputs are combined into the final prediction. The proposed model was validated against historical data from an industrial steam cracking furnace in northwest China. Results show that the FSS stage reduces the number of features by 93.4% and the prediction MSE by 40.6%. Compared with the benchmark ANN model, the proposed DPNN model reduces MSE by 56.6% under the optimal clustering result. Moreover, the proposed framework has strong generalization ability and a modular structure that is easy to modify, so it is expected to help guide ethylene plants to operate within reasonable intervals.
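The clustering step can be illustrated with a minimal sketch of density peak clustering in the style of the original DPC formulation (local density rho, distance delta to the nearest denser point, centers chosen by the largest rho * delta). This is not the authors' implementation: the function name `density_peak_clustering`, the Gaussian density kernel, the `dc_percentile` cutoff heuristic, and the fixed `n_clusters` argument are all assumptions made for illustration.

```python
import numpy as np


def density_peak_clustering(X, n_clusters, dc_percentile=2.0):
    """Illustrative DPC sketch (hypothetical helper, not the paper's code).

    Returns (labels, centers): a cluster label per sample and the
    indices of the samples chosen as cluster centers.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Cutoff distance: a low percentile of the pairwise distances (assumed heuristic).
    dc = np.percentile(dist[np.triu_indices(n, k=1)], dc_percentile)

    # Local density with a Gaussian kernel; subtract the self term exp(0) = 1.
    rho = np.exp(-((dist / dc) ** 2)).sum(axis=1) - 1.0

    # Samples ranked from densest to sparsest.
    order = np.argsort(-rho, kind="stable")

    # delta: distance to the nearest sample of higher density.
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    delta[order[0]] = dist[order[0]].max()  # convention for the densest sample
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j

    # Centers: the samples with the largest rho * delta.
    gamma = rho * delta
    centers = order[np.argsort(-gamma[order], kind="stable")[:n_clusters]]

    # Remaining samples inherit the label of their nearest denser neighbor,
    # processed in order of decreasing density so that neighbor is labeled first.
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, centers
```

In a pipeline like the one described, the returned labels would play the role of production patterns: one regression model would be fitted per label, and each new sample would be routed to the model of its pattern. In practice the number of centers is usually read off a decision graph of rho against delta rather than fixed in advance.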
