Penalty-based aggregation of multidimensional data

Abstract: Research in aggregation theory is still mostly focused on algorithms that summarize tuples of observations drawn either from some real interval or from diverse general ordered structures. In the practice of information processing, however, many other data types between these two extreme cases are worth inspecting. This contribution deals with the aggregation of lists of data points in ℝ^d for arbitrary d ≥ 1. Even though particular functions for summarizing multidimensional data have been discussed by researchers in data analysis, computational statistics, and geometry, there is a clear need for a comprehensive and unified model in which their properties, such as equivariance to geometric transformations, internality, and monotonicity, may be studied at an appropriate level of generality. The proposed penalty-based approach serves as a common framework for all idempotent information aggregation methods, including componentwise functions, pairwise distance minimizers, and data depth-based medians. It also allows many new, practically useful tools to be derived.
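To make the idea of penalty-based aggregation concrete, the following minimal sketch treats an aggregate of points in ℝ^d as a minimizer of a penalty. It assumes the penalty is a sum of Euclidean distances raised to a power, which is only one possible choice and is not taken from the paper's definitions. The function names (`centroid`, `penalty`, `spatial_median`) are hypothetical, and the Weiszfeld iteration used for the spatial median is a standard technique substituted here for illustration.

```python
import math

def centroid(points):
    """Componentwise arithmetic mean; minimizes the sum of squared Euclidean distances."""
    n, d = len(points), len(points[0])
    return [sum(p[j] for p in points) / n for j in range(d)]

def penalty(points, y, power=1):
    """Assumed penalty: sum of Euclidean distances from y to each point, raised to `power`."""
    return sum(math.dist(p, y) ** power for p in points)

def spatial_median(points, iters=200, tol=1e-12):
    """Weiszfeld iteration approximating a minimizer of the sum of Euclidean distances."""
    y = centroid(points)
    for _ in range(iters):
        num, den = [0.0] * len(y), 0.0
        for p in points:
            dist = math.dist(p, y)
            if dist < tol:
                # Simplification: stop when the iterate hits a data point
                # (this point is not guaranteed to be the true minimizer).
                return list(p)
            w = 1.0 / dist
            den += w
            for j in range(len(y)):
                num[j] += w * p[j]
        new_y = [v / den for v in num]
        if math.dist(new_y, y) < tol:
            return new_y
        y = new_y
    return y

points = [(0, 0), (1, 0), (0, 1), (10, 10)]  # made-up data with one outlier
c = centroid(points)
m = spatial_median(points)
print("centroid:", c, "penalty (power=1):", penalty(points, c))
print("spatial median:", m, "penalty (power=1):", penalty(points, m))
```

Both functions are idempotent in the sense used in the abstract: a list of identical points is mapped to that point. The centroid acts componentwise, while the spatial median does not, which is why the latter is less affected by the single outlying point in the example.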
