Hypergraph Clustering for Finding Diverse and Experienced Groups

When forming a team or group of individuals, we often seek a balance between expertise in a particular task and diversity of skills within each group. Here, we view the problem of finding diverse and experienced groups as clustering in hypergraphs with multiple edge types. The input is a hypergraph with multiple hyperedge types, which represent information about the past experiences of groups of individuals, and the output is a set of groups of nodes. In contrast to related problems on fair or balanced clustering, we model diversity in terms of variety of past experience (instead of, e.g., protected attributes), with the goal of forming groups that have both experience and diversity with respect to participation in edge types. In other words, both diversity and experience are measured from the types of the hyperedges. Our clustering model is based on a regularized version of an edge-based hypergraph clustering objective, and we show that naive objectives have no diversity-experience tradeoff. Although our objective function is NP-hard to optimize, we design an efficient 2-approximation algorithm and show how to compute bounds on the regularization hyperparameter that lead to meaningful diversity-experience tradeoffs. We demonstrate an application of this framework to online review platforms, where the goal is to curate sets of user reviews for a product type. In this context, "experience" corresponds to users who are familiar with the type of product, and "diversity" to users who have reviewed related products.
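
To make the idea of measuring experience and diversity from hyperedge types concrete, here is a minimal Python sketch. It is an illustration only, not the regularized objective or the 2-approximation algorithm from the paper. It assumes that each cluster is labeled by one hyperedge type, that a node's experience is the number of hyperedges of its cluster's type that contain it, and that its diversity is the number of hyperedges of any other type that contain it. The toy data and the names `type_degrees` and `experience_and_diversity` are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: each hyperedge is (edge_type, set_of_nodes).
hyperedges = [
    ("A", {"u1", "u2", "u3"}),
    ("A", {"u1", "u4"}),
    ("B", {"u2", "u3"}),
    ("B", {"u1", "u3", "u4"}),
]


def type_degrees(hyperedges):
    """Count, for every node, how many hyperedges of each type contain it."""
    degrees = {}
    for edge_type, nodes in hyperedges:
        for v in nodes:
            degrees.setdefault(v, Counter())[edge_type] += 1
    return degrees


def experience_and_diversity(hyperedges, assignment):
    """Score a clustering under the assumptions stated in the lead-in.

    assignment maps each node to the edge type that labels its cluster.
    Experience sums participation in the node's own cluster type;
    diversity sums participation in every other type.
    """
    degrees = type_degrees(hyperedges)
    experience = 0
    diversity = 0
    for v, cluster in assignment.items():
        counts = degrees.get(v, Counter())
        experience += counts[cluster]
        diversity += sum(counts.values()) - counts[cluster]
    return experience, diversity


# Assumed example assignment: two nodes per cluster.
assignment = {"u1": "A", "u2": "A", "u3": "B", "u4": "B"}
scores = experience_and_diversity(hyperedges, assignment)
```

Under these assumed definitions, moving a node to a different cluster shifts its hyperedge participation between the two totals, which gives a simple way to compare candidate groupings on a small example.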
