Meta-Modelling Meta-Learning

Although artificial intelligence and machine learning are currently extremely fashionable, applying machine learning to real-life problems remains very challenging. Data scientists must evaluate various learning algorithms and tune their numerous parameters, based on their assumptions and experience, against concrete problems and training data sets. This is a long, tedious, and resource-intensive task. Meta-learning is a recent technique to overcome, i.e. automate, this process. It aims at using machine learning itself to automatically learn the most appropriate algorithms and parameters for a given machine learning problem. As it turns out, there are many parallels between meta-modelling (in the sense of model-driven engineering) and meta-learning. Both rely on abstractions, the metadata, to model a predefined class of problems and to define the variabilities of the models conforming to this definition. Both are used to define input and output relationships and then to fit the right models to represent that behaviour. In this paper, we envision what a meta-model for meta-learning could look like. We discuss its possible variabilities, for which types of learning it could be appropriate, how concrete learning models can be generated from it, and how models can finally be selected. Last but not least, we discuss a possible integration into existing modelling tools.
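
To make the idea concrete, the following is a minimal sketch, assuming Python with scikit-learn, of how such a meta-model could be operationalised: a declarative search space describes the variabilities (the admissible algorithms and their hyperparameter ranges), concrete learning models conforming to it are generated, and one is finally selected on held-out data. The names (SEARCH_SPACE, sample_model) and the random-search selection strategy are illustrative assumptions, not the method proposed in the paper.

```python
import random

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical "meta-model": a declarative description of the variabilities,
# i.e. which algorithms are allowed and which hyperparameter ranges each exposes.
SEARCH_SPACE = {
    RandomForestClassifier: {
        "n_estimators": lambda: random.randint(10, 200),
        "max_depth": lambda: random.randint(2, 20),
    },
    SVC: {
        "C": lambda: 10 ** random.uniform(-3, 3),
        "kernel": lambda: random.choice(["linear", "rbf"]),
    },
}


def sample_model():
    """Generate one concrete learning model conforming to the meta-model."""
    algo, params = random.choice(list(SEARCH_SPACE.items()))
    return algo(**{name: draw() for name, draw in params.items()})


# Toy data standing in for a concrete problem and its training data set.
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Model selection: fit each generated candidate, keep the best on held-out data.
best_score, best_model = -1.0, None
for _ in range(20):
    model = sample_model().fit(X_train, y_train)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_score, best_model = score, model

print(f"selected {type(best_model).__name__} with accuracy {best_score:.3f}")
```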
