We present a two-parameter loss function which can be viewed as a generalization of many popular loss functions used in robust statistics: the Cauchy/Lorentzian, Geman-McClure, Welsch, and generalized Charbonnier loss functions (and, by transitivity, the L2, L1, L1-L2, and pseudo-Huber/Charbonnier loss functions). We describe and visualize this loss and document several of its useful properties.

Many problems in statistics [8] and optimization [6] require robustness: that a model be insensitive to outliers. This idea is often used in parameter estimation tasks, where a non-robust loss function such as the L2 norm is replaced with some more robust alternative in the face of non-Gaussian noise. Practitioners, especially in the image processing and computer vision literature, have developed a large collection of robust loss functions with different parametrizations and properties (some of which are summarized well in [2, 13]). These loss functions are often used within gradient-descent or second-order methods, or as part of M-estimation or some more specialized optimization approach. Unless the optimization strategy is co-designed with the loss being minimized, these losses are often "plug and play": only a loss and its gradient are necessary to integrate a new loss function into an existing system. When designing new models or experimenting with different design choices, practitioners often swap in different loss functions to see how they behave.

In this paper we present a single loss function that is a superset of many of these common loss functions. A single continuous-valued parameter in our loss function can be set such that our loss is exactly equal to several traditional loss functions, but it can also be tuned arbitrarily to model a wider family of loss functions. As a result, this loss may be useful to practitioners wishing to easily and continuously explore a wide variety of robust loss functions.
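The loss itself is not reproduced in this excerpt, so as an illustration, below is a minimal Python sketch of a two-parameter family with exactly the special cases named above (following the widely used form of this generalization). The function and parameter names (general_loss, alpha for the shape parameter, c for the scale) are ours, not taken from the text; treat this as an assumed reconstruction rather than the paper's definitive implementation.

```python
import numpy as np

def general_loss(x, alpha, c=1.0):
    """Sketch of a two-parameter robust loss (assumed form, not from the excerpt).

    alpha selects the member of the family; c scales the residual x.
    The general expression is a 0/0 indeterminate form at alpha = 0 and
    alpha = 2 and a limit at alpha = -inf, so those cases are handled
    explicitly with their known closed forms.
    """
    z = (x / c) ** 2
    if alpha == 2.0:            # L2 (squared error): z / 2
        return 0.5 * z
    if alpha == 0.0:            # Cauchy / Lorentzian: log(z/2 + 1)
        return np.log1p(0.5 * z)
    if alpha == -np.inf:        # Welsch: 1 - exp(-z/2)
        return 1.0 - np.exp(-0.5 * z)
    # General case; alpha = 1 recovers pseudo-Huber/Charbonnier
    # (sqrt(z + 1) - 1), alpha = -2 recovers Geman-McClure (2z / (z + 4)).
    b = abs(alpha - 2.0)
    return (b / alpha) * ((z / b + 1.0) ** (alpha / 2.0) - 1.0)

# Sweeping alpha continuously interpolates between the named losses:
x = np.linspace(-5.0, 5.0, 11)
for a in [2.0, 1.0, 0.0, -2.0, -np.inf]:
    print(a, general_loss(x, a))
```

In the "plug and play" setting described above, the gradient with respect to x needed by an existing optimizer can be obtained from this expression in closed form or via automatic differentiation.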
[1] Michael J. Black, et al. The outlier process: unifying line processes and robust statistics, 1994, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[2] Vladlen Koltun, et al. Fast MRF Optimization with Application to Depth Reconstruction, 2014, IEEE Conference on Computer Vision and Pattern Recognition.
[3] Yvan G. Leclerc, et al. Constructing simple stable descriptions for image partitioning, 1989, International Journal of Computer Vision.
[4] Zhengyou Zhang, et al. Parameter estimation techniques: a tutorial with application to conic fitting, 1997, Image Vis. Comput.
[5] Stuart Geman, et al. Statistical methods for tomographic image reconstruction, 1987.
[6] S. Nadarajah. A generalized normal distribution, 2005.
[7] Frederick R. Forst, et al. On robust estimation of the location parameter, 1980.
[8] Michael J. Black, et al. The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields, 1996, Comput. Vis. Image Underst.
[9] Michael J. Black, et al. Secrets of optical flow estimation and their principles, 2010, IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[10] J. Dennis, et al. Techniques for nonlinear least squares and robust regression, 1978.
[11] Vladlen Koltun, et al. Efficient Nonlocal Regularization for Optical Flow, 2012, ECCV.
[12] Trevor Hastie, et al. Statistical Learning with Sparsity: The Lasso and Generalizations, 2015.
[13] B. Ripley, et al. Robust Statistics, 2018, Encyclopedia of Mathematical Geosciences.
[14] Michel Barlaud, et al. Two deterministic half-quadratic regularization algorithms for computed imaging, 1994, Proceedings of 1st International Conference on Image Processing.