Uniform and scalable SAT-sampling for configurable systems

Several relevant analyses of configurable software systems remain intractable because they require examining vast and highly constrained configuration spaces. Those analyses could be addressed through statistical inference, i.e., by working with a much more tractable sample whose results are later generalized to the entire configuration space. To make this possible, the laws of statistical inference impose an indispensable requirement: each member of the population must be equally likely to be included in the sample, i.e., the sampling process needs to be "uniform". Various SAT-samplers have been developed for generating uniform random samples at a reasonable computational cost. Unfortunately, there is a lack of experimental validation on large configuration models showing whether the samplers indeed produce genuinely uniform samples. This paper (i) presents a new statistical test to verify to what extent samplers achieve uniformity and (ii) reports the evaluation of four state-of-the-art samplers: Spur, QuickSampler, Unigen2, and Smarch. According to our experimental results, only Spur satisfies both scalability and uniformity.
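To make the uniformity requirement concrete, the following minimal Python sketch checks a sampler against the "equally likely" condition on a toy model. It is not the statistical test this paper proposes: it is a textbook chi-square goodness-of-fit statistic computed by enumerating every solution, which is only feasible for tiny formulas. The clause set, the two toy samplers, and all function names are hypothetical and chosen for illustration only.

```python
import itertools
import random
from collections import Counter

# Hypothetical toy model: 4 features, clauses as DIMACS-style signed integers.
CLAUSES = [(1,), (-2, 1), (-3, -4), (2, 3, 4)]
NUM_VARS = 4


def satisfies(assignment, clauses):
    """Return True if the assignment (tuple of bools, index i = variable i+1)
    satisfies every clause."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def all_solutions(num_vars, clauses):
    """Enumerate all satisfying assignments by brute force (toy sizes only)."""
    return [
        a
        for a in itertools.product((False, True), repeat=num_vars)
        if satisfies(a, clauses)
    ]


def chi_square_statistic(samples, solutions):
    """Chi-square goodness-of-fit statistic against the uniform distribution
    over `solutions`. Degrees of freedom are len(solutions) - 1."""
    counts = Counter(samples)
    expected = len(samples) / len(solutions)
    return sum((counts[s] - expected) ** 2 / expected for s in solutions)


def uniform_sampler(solutions, n, rng):
    """Reference sampler: every solution is equally likely."""
    return [rng.choice(solutions) for _ in range(n)]


def biased_sampler(solutions, n, rng):
    """Deliberately non-uniform sampler: later solutions are favoured."""
    weights = [i + 1 for i in range(len(solutions))]
    return rng.choices(solutions, weights=weights, k=n)


if __name__ == "__main__":
    rng = random.Random(0)
    solutions = all_solutions(NUM_VARS, CLAUSES)
    for name, sampler in (("uniform", uniform_sampler), ("biased", biased_sampler)):
        stat = chi_square_statistic(sampler(solutions, 5000, rng), solutions)
        print(f"{name}: chi-square = {stat:.2f}, df = {len(solutions) - 1}")
```

The toy model has five solutions, so the statistic has four degrees of freedom; at the 0.05 significance level the usual critical value is about 9.49, and a statistic far above it is evidence against uniformity. Because this check needs one counter per solution, it cannot be applied directly to configuration spaces of realistic size, which is why a scalable test is needed.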
