Adaptive relaxed synchronization through the use of supervised learning methods

Abstract: Several authors have proposed relaxed synchronization to speed up parallel applications that admit a tradeoff between result quality and execution time. However, most of these works remove synchronization primitives entirely and do not anticipate the quality of the results that different input data will produce. In this paper, we propose a novel strategy for relaxing synchronization and evaluate whether supervised learning methods can keep the results of the relaxed synchronization technique within acceptable error limits. We run a varied set of program inputs to build a control base, which provides the training data for the supervised learning methods. When the user wants to run the application on new input data (in the same execution environment), the trained classification algorithm suggests the relax factor best suited to the combination of application, input, and execution environment. Using this methodology, we obtained a gain of 3.5x for the K-means algorithm applied to videos while maintaining the desired quality rate.
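The workflow described in the abstract (build a control base from varied inputs, train a classifier, then ask it for a relax factor on a new input) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two-value feature vectors, the relax factor labels, the name `suggest_relax_factor`, and the choice of a k-nearest-neighbour classifier are hypothetical and are not taken from the paper.

```python
from math import dist

# Hypothetical control base: each row pairs the features of one training input
# with the relax factor that kept its output within the accepted error bound.
# Feature values and factors are invented for illustration only.
CONTROL_BASE = [
    ((0.10, 0.20), 8),
    ((0.15, 0.25), 8),
    ((0.80, 0.90), 2),
    ((0.85, 0.70), 2),
    ((0.50, 0.50), 4),
]


def suggest_relax_factor(features, control_base=CONTROL_BASE, k=1):
    """Suggest a relax factor by majority vote among the k nearest inputs."""
    # Rank the control-base rows by Euclidean distance to the new input.
    neighbours = sorted(control_base, key=lambda row: dist(row[0], features))[:k]

    # Count how many of the nearest rows voted for each relax factor.
    votes = {}
    for _, factor in neighbours:
        votes[factor] = votes.get(factor, 0) + 1

    # Most votes wins; ties go to the smaller (more conservative) factor.
    return max(votes, key=lambda f: (votes[f], -f))


if __name__ == "__main__":
    print(suggest_relax_factor((0.12, 0.22)))
    print(suggest_relax_factor((0.82, 0.80)))
```

Breaking ties toward the smaller factor is a design choice of this sketch: when the evidence is split, it prefers less relaxation and therefore a lower risk of leaving the acceptable error range.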
