FLASH: A Faster Optimizer for SBSE Tasks

Most problems in search-based software engineering involve balancing conflicting objectives. Prior approaches to this task have required a large number of evaluations, which makes them slow to execute and hard to comprehend. To solve these problems, this paper introduces FLASH, a decision-tree-based optimizer that incrementally grows one decision tree per objective. These trees are then used to select the next best sample. This paper compares FLASH to state-of-the-art algorithms from search-based SE and machine learning. The comparison uses multiple SBSE case studies covering release planning, configuration control, process modeling, and sprint planning for agile development. FLASH was found to be the fastest optimizer (sometimes requiring less than 1% of the evaluations used by evolutionary algorithms). Also, measured in terms of model size, FLASH's reasoning was far more succinct and comprehensible. Further, measured by the effectiveness of the optimizations found, FLASH's recommendations were highly competitive with those of other approaches. Finally, FLASH scaled to more complex models, since it always terminated (whereas the state-of-the-art algorithms did not).
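The loop described above (grow one tree per objective from the samples evaluated so far, then use the trees to pick the next sample) can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny regression tree, the function names (`fit_tree`, `predict`, `flash_like`), the parameters `init` and `budget`, and especially the acquisition rule (choose the unevaluated candidate with the lowest sum of predicted objective values, assuming every objective is minimized) are assumptions made only for illustration.

```python
import random

def _sse(values):
    """Sum of squared errors around the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def fit_tree(X, y, min_leaf=2, depth=0, max_depth=4):
    """Tiny CART-style regression tree (illustrative, not optimized).
    A leaf is a float (mean of y); a node is (feature, threshold, left, right)."""
    mean = sum(y) / len(y)
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return mean
    best = None  # (sse, feature, threshold)
    for f in range(len(X[0])):
        for t in sorted(set(row[f] for row in X))[1:]:
            left = [yi for row, yi in zip(X, y) if row[f] < t]
            right = [yi for row, yi in zip(X, y) if row[f] >= t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            sse = _sse(left) + _sse(right)
            if best is None or sse < best[0]:
                best = (sse, f, t)
    if best is None:
        return mean
    _, f, t = best
    li = [i for i, row in enumerate(X) if row[f] < t]
    ri = [i for i, row in enumerate(X) if row[f] >= t]
    return (f, t,
            fit_tree([X[i] for i in li], [y[i] for i in li], min_leaf, depth + 1, max_depth),
            fit_tree([X[i] for i in ri], [y[i] for i in ri], min_leaf, depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's mean."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] < t else right
    return tree

def flash_like(candidates, evaluate, n_objectives, init=10, budget=30, seed=0):
    """Sequential model-based search with one tree per objective.
    `evaluate` is the expensive function; it is called at most `budget` times."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    evaluated = [(c, evaluate(c)) for c in pool[:init]]  # random initial sample
    pool = pool[init:]
    while pool and len(evaluated) < budget:
        X = [c for c, _ in evaluated]
        # Rebuild one tree per objective from everything evaluated so far.
        trees = [fit_tree(X, [s[k] for _, s in evaluated]) for k in range(n_objectives)]
        # Assumed acquisition rule: lowest sum of predicted objectives (all minimized).
        best = min(pool, key=lambda c: sum(predict(t, c) for t in trees))
        pool.remove(best)
        evaluated.append((best, evaluate(best)))  # only now pay for a real evaluation
    return evaluated

if __name__ == "__main__":
    # Hypothetical toy problem: two objectives over a 10x10 grid of configurations.
    grid = [(x, y) for x in range(10) for y in range(10)]
    results = flash_like(grid, lambda c: ((c[0] - 3) ** 2, (c[1] - 7) ** 2), n_objectives=2)
    print(len(results), "evaluations out of", len(grid), "candidates")
```

The point of the sketch is the cost structure: the trees are cheap to rebuild and to query over the whole pool, so the expensive `evaluate` call is made only once per iteration, on the single candidate the trees rank best.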
