“Sampling” as a Baseline Optimizer for Search-Based Software Engineering

Increasingly, Software Engineering (SE) researchers use search-based optimization techniques to solve SE problems with multiple conflicting objectives. These techniques often apply CPU-intensive evolutionary algorithms to explore generations of mutations to a population of candidate solutions. An alternative approach, proposed in this paper, is to start with a very large population and sample down to just the better solutions. We call this method “Sway”, short for “the sampling way”. This paper compares Sway against state-of-the-art search-based SE tools using seven models: five software product line models, and two software process control models (concerned with project management, effort estimation, and selection of requirements) during incremental agile development. For these models, the experiments show that Sway is competitive with the corresponding state-of-the-art evolutionary algorithms while requiring orders of magnitude fewer evaluations. Given the simplicity and effectiveness of Sway, we therefore propose it as a baseline method for search-based software engineering models, especially models that are very slow to execute.
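The idea of starting with a very large population and sampling down to the better solutions can be illustrated with a minimal sketch. The abstract does not say how the sampling is done, so everything below is an assumption made for illustration: candidates are numeric tuples, distance is Euclidean, each step splits the population around two far-apart representatives, and only those two representatives are evaluated. The names `sample_down`, `evaluate`, `better`, and `enough` are hypothetical and are not taken from the paper.

```python
import random


def sample_down(population, evaluate, better, enough=8, rng=None):
    """Recursively halve `population`, keeping the half that lies nearer
    the better of two far-apart representatives.

    Illustrative assumptions (not from the paper): candidates are numeric
    tuples, distance is Euclidean, and `better(a, b)` returns True when
    objective value `a` is preferred over `b`.
    """
    rng = rng or random.Random(0)
    if len(population) <= enough:
        return list(population)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # Pick two distant representatives: the point farthest from a random
    # anchor, then the point farthest from that one.
    anchor = rng.choice(population)
    west = max(population, key=lambda c: dist(c, anchor))
    east = max(population, key=lambda c: dist(c, west))

    # Order candidates from "closest to west" to "closest to east".
    ordered = sorted(population, key=lambda c: dist(c, west) - dist(c, east))
    half = len(ordered) // 2
    west_half, east_half = ordered[:half], ordered[half:]

    # Only the two representatives are evaluated at each level.
    keep = west_half if better(evaluate(west), evaluate(east)) else east_half
    return sample_down(keep, evaluate, better, enough, rng)
```

Under these assumptions, a population of 1024 candidates reduced to 8 needs 7 halvings and therefore 14 model evaluations, which is the kind of saving that matters when a model is slow to execute. An evolutionary algorithm would typically evaluate every candidate in every generation.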
