A new adaptive sampling approach for Genetic Programming

Genetic Programming (GP) suffers from excessive computation time, a problem that is exacerbated on data-intensive problems. This issue has been addressed with different approaches, such as sampling techniques and distributed implementations. In this paper, we focus on dynamic sampling algorithms, most of which give the GP learner a new sample at each generation. As a result, individuals do not have enough time to extract the hidden knowledge from a sample before it is replaced. We propose adaptive sampling, which lies halfway between static and dynamic methods. It is a flexible approach applicable to any dynamic sampling algorithm. We implemented several variants based on controlling the re-sampling frequency and applied them to the KDD intrusion detection problem with GP. The experimental study shows that adaptive sampling preserves the power of dynamic sampling, with possible improvements in learning time and quality for some sampling algorithms. This work opens several relevant paths for extension.
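
The idea of controlling the re-sampling frequency can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the function names (`fixed_period`, `on_stagnation`, `evolve`), the two policies, and the `step` callback standing in for one GP generation are all hypothetical, and uniform random sampling is used only as a placeholder for whatever dynamic sampling algorithm is being wrapped.

```python
import random

def fixed_period(period):
    """Hypothetical policy: re-sample every `period` generations.
    period=1 reduces to plain dynamic sampling (new sample each generation)."""
    def policy(generation, history):
        return generation % period == 0
    return policy

def on_stagnation(patience):
    """Hypothetical policy: re-sample once the best fitness on the current
    sample has not improved during the last `patience` generations."""
    def policy(generation, history):
        if len(history) <= patience:
            return False
        return max(history[-patience:]) <= max(history[:-patience])
    return policy

def evolve(data, sample_size, generations, policy, step, seed=0):
    """Run `generations` GP steps and return the generations after which
    a new training sample was drawn (0 denotes the initial sample).

    `step(sample)` is an assumed callback that runs one GP generation on
    `sample` and returns the best fitness found (higher is better)."""
    rng = random.Random(seed)
    sample = rng.sample(data, sample_size)  # placeholder for any sampler
    history = []                            # best fitness on the current sample
    resample_gens = [0]
    for gen in range(1, generations + 1):
        history.append(step(sample))
        # Decide whether the next generation should see a new sample.
        if gen < generations and policy(gen, history):
            sample = rng.sample(data, sample_size)
            history = []
            resample_gens.append(gen)
    return resample_gens
```

Under these assumptions, a static method corresponds to a policy that never fires, plain dynamic sampling to `fixed_period(1)`, and anything in between keeps each sample for several generations so that individuals have more time to learn from it.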
