A Machine Learning Approach to Predicting the Smoothed Complexity of Sorting Algorithms

Smoothed analysis is a framework for analyzing the complexity of an algorithm that interpolates between average-case and worst-case behaviour. For example, Quicksort and the Simplex algorithm are widely used in practice despite their poor worst-case complexity, and smoothed complexity aims to characterize such algorithms more faithfully. Existing theoretical bounds on the smoothed complexity of sorting algorithms are still quite weak. Furthermore, computing the smoothed complexity empirically from its original definition is computationally infeasible, even for modest input sizes. In this paper, we focus on accurately predicting the smoothed complexity of sorting algorithms using machine learning techniques. We propose two regression models that take into account various properties of sorting algorithms, together with some known theoretical results in smoothed analysis, to improve prediction quality. We present experimental results for predicting the smoothed complexity of Quicksort, Mergesort, and optimized Bubblesort for large input sizes, thereby filling the gap between known theoretical and empirical results.
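To illustrate why the direct computation is infeasible, the following minimal sketch estimates a smoothed cost by brute force. It is not the method of the paper. It assumes the partial-permutation perturbation model (each position is marked independently with probability p, and the marked elements are shuffled among themselves), a first-element-pivot Quicksort, and comparison count as the cost measure. All function names are hypothetical. The outer loop visits all n! inputs, which is what rules out this approach beyond very small n.

```python
import itertools
import random


def quicksort_comparisons(seq):
    """Count comparisons made by a first-element-pivot Quicksort (assumed variant)."""
    if len(seq) <= 1:
        return 0
    pivot, rest = seq[0], seq[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    # One comparison against the pivot per remaining element.
    return len(rest) + quicksort_comparisons(smaller) + quicksort_comparisons(larger)


def partial_permutation(seq, p, rng):
    """Mark each position with probability p, then shuffle the marked elements."""
    out = list(seq)
    marked = [i for i in range(len(out)) if rng.random() < p]
    values = [out[i] for i in marked]
    rng.shuffle(values)
    for i, v in zip(marked, values):
        out[i] = v
    return out


def estimate_smoothed_cost(n, p, samples, seed=0):
    """Max over all n! inputs of the sampled mean cost after perturbation."""
    rng = random.Random(seed)
    worst = 0.0
    for base in itertools.permutations(range(n)):  # n! candidate inputs
        total = sum(
            quicksort_comparisons(partial_permutation(base, p, rng))
            for _ in range(samples)
        )
        worst = max(worst, total / samples)
    return worst
```

With p = 0 no element is perturbed and the estimate reduces to the worst-case cost. With p = 1 every input is fully shuffled, so the estimate approximates the average case up to sampling noise.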
