Approximation algorithms for stochastic clustering

We consider stochastic settings for clustering and develop provably good approximation algorithms for several of them. These algorithms achieve better approximation ratios than in the usual deterministic clustering setting. They also offer further advantages, including fairer clusterings and clusterings with better long-term behavior for each user. In particular, they ensure that *every user* is guaranteed to get good service (on average). We also complement some of these algorithms with impossibility results.
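As a toy illustration of the per-user, on-average guarantee (not the algorithms of the paper), consider two users on a line and a single center. Any deterministic choice leaves one user at distance 1, whereas a lottery over the two possible centers gives each user an expected distance of 1/2. The function name `expected_distances`, the instance, and the use of a one-dimensional metric are assumptions made purely for this sketch.

```python
from fractions import Fraction

def expected_distances(points, distribution):
    """Expected distance from each point to its nearest open center.

    `distribution` is a list of (probability, centers) pairs describing a
    lottery over center sets. Points and centers are numbers on a line
    (an assumption made only for this illustration).
    """
    result = []
    for p in points:
        exp = sum(
            prob * min(abs(p - c) for c in centers)
            for prob, centers in distribution
        )
        result.append(exp)
    return result

# Hypothetical instance: two users at positions 0 and 1, one center allowed.
points = [0, 1]

# Deterministic solution: always open the center at position 0.
deterministic = [(Fraction(1), [0])]

# Stochastic solution: open the center at 0 or at 1 with equal probability.
lottery = [(Fraction(1, 2), [0]), (Fraction(1, 2), [1])]

det = expected_distances(points, deterministic)
lot = expected_distances(points, lottery)

print("deterministic worst-case user:", max(det))
print("lottery worst-case user:", max(lot))
```

The comparison is by the worst-off user's expected distance, which is one natural way to read "every user gets good service on average"; the paper's formal notions may differ.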
