Outlier Detection for Learning-Based Optimizing Compiler

Modern compilers use machine learning to derive, from prior experience, useful heuristics for newly encountered programs, which accelerates the optimization process. However, prior experience may not apply to outlier programs with unfamiliar code features. This paper presents an outlier detection approach based on the reverse k-nearest neighbor (RKNN) algorithm. With it, the compiler can launch a search within the optimization space when it encounters an outlier program, or directly apply its experience to non-outliers. Preliminary experimental results demonstrate the effectiveness of the approach.
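
To make the idea concrete, the following is a minimal sketch of how a reverse k-nearest neighbor test might flag an outlier program. It is not the paper's implementation: the abstract does not state the exact criterion, so the sketch assumes that each program is described by a numeric code-feature vector, that distance is Euclidean, and that a program counts as an outlier when fewer than `min_rknn` training programs would include it among their k nearest neighbors. The names `reverse_knn_count`, `is_outlier`, `choose_strategy`, and the parameters `k` and `min_rknn` are hypothetical.

```python
import math


def kth_neighbor_distance(index, points, k):
    """Distance from points[index] to its k-th nearest *other* training point."""
    dists = sorted(
        math.dist(points[index], p) for j, p in enumerate(points) if j != index
    )
    return dists[k - 1]


def reverse_knn_count(query, points, k):
    """Count training points that would rank `query` among their k nearest neighbors.

    A training point p counts if `query` is at least as close to p as p's
    current k-th nearest neighbor (ties are resolved in favor of `query`).
    """
    if len(points) <= k:
        raise ValueError("need more than k training points")
    return sum(
        1
        for i, p in enumerate(points)
        if math.dist(p, query) <= kth_neighbor_distance(i, points, k)
    )


def is_outlier(query, points, k=2, min_rknn=1):
    """Assumed criterion: too few reverse neighbors means an unfamiliar program."""
    return reverse_knn_count(query, points, k) < min_rknn


def choose_strategy(query, points, k=2, min_rknn=1):
    """Search the optimization space for outliers, reuse experience otherwise."""
    return "search" if is_outlier(query, points, k, min_rknn) else "reuse"


# Hypothetical two-dimensional feature vectors of previously seen programs.
training = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

print(choose_strategy((0.5, 0.5), training))    # lies among the known programs
print(choose_strategy((10.0, 10.0), training))  # far from every known program
```

The brute-force scan costs quadratic time in the number of training programs per query, which is acceptable for a sketch; an index structure for reverse nearest neighbor queries would be the natural replacement at scale.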
