Automated Machine Learning: The New Wave of Machine Learning

With the explosion in the use of machine learning across domains, the need for an efficient pipeline for developing machine learning models has never been more critical. However, building and training models remains largely manual: it depends on domain experts and time-consuming data manipulation, which impedes the development of machine learning models in both academia and industry. This demand motivates a new research area concerned with fitting machine learning models fully automatically, i.e., AutoML. Automated Machine Learning (AutoML) is an end-to-end process that aims to automate this model development pipeline without any external assistance. First, we provide an overview of AutoML. Second, we delve into the individual segments of the AutoML pipeline and briefly cover the approaches used in each. We also provide a case study on the industrial use and impact of AutoML, with a focus on practical applicability in a business context. Finally, we conclude with open research issues and future research directions.
