Automatic composition and optimisation of multicomponent predictive systems

Composing and parametrising multicomponent predictive systems (MCPSs), which consist of chains of data transformation steps, is a challenging task. This paper presents theoretical considerations and an extensive experimental analysis aimed at automating the construction of such predictive systems. In the theoretical part of the paper, we first propose adopting the Well-handled and Acyclic Workflow (WA-WF) Petri net as a formal representation of MCPSs. We then define the optimisation problem in which the search space consists of suitably parametrised directed acyclic graphs (i.e. WA-WFs) forming the sought MCPS solutions. In the experimental analysis, we examine the impact of considerably extending the search space by incorporating multiple sequential data cleaning and preprocessing steps into the composition of optimised MCPSs, as well as the quality of the solutions found. In a range of extensive experiments, three different optimisation strategies are used to automatically compose MCPSs for 21 publicly available datasets and 7 datasets from real chemical processes. The diversity of the composed MCPSs indicates that fully and automatically exploiting different combinations of data cleaning and preprocessing techniques is possible and highly beneficial for different predictive models. Our findings can have a major impact on the development of high-quality predictive models, as well as on the maintenance and scalability aspects needed in modern applications and deployment scenarios.
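
To make the idea of searching over parametrised chains of preprocessing steps concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a fixed chain (impute, clip outliers, scale, then a k-nearest-neighbour predictor) in which individual steps can be switched on or off, which is far simpler than the general WA-WF search space described above. It uses plain random search only because it is the simplest strategy to show; the abstract does not name the three strategies used. All names (`SEARCH_SPACE`, `fit_chain`, `random_search`, `make_data`) and the toy data are hypothetical.

```python
import random
import statistics

# Hypothetical search space: per-step choices plus one model hyperparameter.
SEARCH_SPACE = {"impute": ["mean", "median"], "clip": [False, True],
                "scale": [False, True], "k": [1, 3, 5, 7]}

def fit_imputer(column, how):
    observed = [v for v in column if v is not None]
    fill = statistics.mean(observed) if how == "mean" else statistics.median(observed)
    return lambda v: fill if v is None else v

def fit_clipper(column, enabled):
    if not enabled:
        return lambda v: v
    mu, sd = statistics.mean(column), statistics.pstdev(column)
    lo, hi = mu - 2 * sd, mu + 2 * sd
    return lambda v: min(max(v, lo), hi)

def fit_scaler(column, enabled):
    if not enabled:
        return lambda v: v
    mu, sd = statistics.mean(column), statistics.pstdev(column) or 1.0
    return lambda v: (v - mu) / sd

def fit_chain(train_rows, config):
    """Fit impute -> clip -> scale per column on training data only."""
    per_column = []
    for j in range(len(train_rows[0])):
        column = [row[j] for row in train_rows]
        steps = []
        for fit, setting in ((fit_imputer, config["impute"]),
                             (fit_clipper, config["clip"]),
                             (fit_scaler, config["scale"])):
            step = fit(column, setting)
            column = [step(v) for v in column]  # later steps see earlier output
            steps.append(step)
        per_column.append(steps)

    def transform(row):
        out = []
        for value, steps in zip(row, per_column):
            for step in steps:
                value = step(value)
            out.append(value)
        return out
    return transform

def knn_predict(train_x, train_y, query, k):
    order = sorted(range(len(train_x)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)))
    return statistics.mean(train_y[i] for i in order[:k])

def validation_error(config, train, valid):
    """Mean squared error of the composed chain on held-out data."""
    (train_x, train_y), (valid_x, valid_y) = train, valid
    transform = fit_chain(train_x, config)
    tx = [transform(row) for row in train_x]
    return statistics.mean(
        (knn_predict(tx, train_y, transform(row), config["k"]) - y) ** 2
        for row, y in zip(valid_x, valid_y))

def random_search(train, valid, n_trials=30, seed=0):
    rng = random.Random(seed)
    best_config, best_error = None, float("inf")
    for _ in range(n_trials):
        config = {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}
        error = validation_error(config, train, valid)
        if error < best_error:
            best_config, best_error = config, error
    return best_config, best_error

def make_data(n, rng):
    """Toy regression data: two features on different scales, ~10% missing values."""
    xs, ys = [], []
    for _ in range(n):
        a, b = rng.uniform(0, 1), rng.uniform(0, 1000)
        ys.append(3 * a + 0.001 * b + rng.gauss(0, 0.05))
        xs.append([a if rng.random() > 0.1 else None, b])
    return xs, ys

if __name__ == "__main__":
    rng = random.Random(1)
    train, valid = make_data(60, rng), make_data(30, rng)
    print(random_search(train, valid))
```

Each preprocessing step is fitted on the training split only and then applied unchanged to the validation split, so the validation error reflects the whole chain rather than the predictor alone. Treating the structure (which steps are active) and the hyperparameters (here `k`) as one joint configuration is the part of this sketch that mirrors the combined composition and parametrisation problem described in the abstract.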
