Evaluation methodology for complex non-deterministic functions: A case study in metaheuristic optimization of caches

When a non-deterministic function is evolved with Evolutionary Algorithms, each candidate solution is usually evaluated multiple times to estimate its characteristic behavior. This methodology is valid unless a single evaluation is so complex that the repeated fitness evaluations lead to unacceptably long optimization times. The challenge can be addressed in three ways: by resorting to a simpler surrogate performance model; if a surrogate model is not precise enough, by parallelizing the search; or by minimizing the number of fitness evaluations, provided the impact on search quality is acceptable. The work presented in this paper is motivated by the optimization of processor caches, for which performance evaluation is highly complex and non-deterministic due to the non-deterministic behavior of today's operating systems. Since repeated fitness evaluations result in unacceptably prolonged computation times, we employ statistical methods to identify the best-performing candidates using as few fitness evaluations as possible. We describe the different approaches we investigated before finally selecting the Wilcoxon rank-sum test to adaptively control a fitness evaluation scheme. With this novel scheme we are able to reduce optimization times by a factor of 3.6 without a significant degradation in convergence behavior.
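To illustrate the general idea of letting a rank-sum test decide how many evaluations a candidate needs, the following is a minimal sketch and not the scheme from the paper. It assumes that lower fitness values are better (for example, run time), that the incumbent's samples are already available, and that the normal approximation of the rank-sum statistic without tie correction is adequate. The names `rank_sum_p_value`, `candidate_is_better`, `min_runs`, `max_runs`, and `alpha` are hypothetical and chosen only for this example.

```python
import math
import random
from statistics import median


def rank_sum_p_value(a, b):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation).

    Tied values receive average ranks; the variance is not tie-corrected.
    """
    n1, n2 = len(a), len(b)
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; their average is (i+1+j)/2.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n1 * (n1 + n2 + 1) / 2.0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_a - mean) / std
    return math.erfc(abs(z) / math.sqrt(2.0))


def candidate_is_better(evaluate, candidate, incumbent_samples,
                        min_runs=5, max_runs=30, alpha=0.05):
    """Evaluate `candidate` only as often as needed (assumed: lower is better).

    Sampling stops once the test reports a significant difference or the
    evaluation budget `max_runs` is exhausted.
    """
    samples = [evaluate(candidate) for _ in range(min_runs)]
    while True:
        p = rank_sum_p_value(samples, incumbent_samples)
        if p < alpha or len(samples) >= max_runs:
            break
        samples.append(evaluate(candidate))
    better = p < alpha and median(samples) < median(incumbent_samples)
    return better, samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical noisy fitness: the candidate value plus measurement noise.
    noisy = lambda c: c + rng.gauss(0.0, 1.0)
    incumbent = [noisy(10.0) for _ in range(10)]
    better, runs = candidate_is_better(noisy, 7.0, incumbent)
    print(better, len(runs))
```

Testing repeatedly as samples accumulate inflates the false-positive rate compared with a single test at a fixed sample size, so the threshold `alpha` in this sketch should be read as a tuning knob rather than an exact significance level.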
