Paper Information - Modeling Recommender Ecosystems: Research Challenges at the Intersection of Mechanism Design, Reinforcement Learning and Generative Models

Modern recommender systems lie at the heart of complex ecosystems that couple the behavior of users, content providers, advertisers, and other actors. Despite this, the focus of the majority of recommender research -- and most practical recommenders of any import -- is on the local, myopic optimization of the recommendations made to individual users. This comes at a significant cost to the long-term utility that recommenders could generate for their users. We argue that explicitly modeling the incentives and behaviors of all actors in the system -- and the interactions among them induced by the recommender's policy -- is strictly necessary if one is to maximize the value the system brings to these actors and improve overall ecosystem "health". Doing so requires: optimization over long horizons using techniques such as reinforcement learning; making inevitable tradeoffs in the utility that can be generated for different actors using the methods of social choice; reducing information asymmetry, while accounting for incentives and strategic behavior, using the tools of mechanism design; better modeling of both user and item-provider behaviors by incorporating notions from behavioral economics and psychology; and exploiting recent advances in generative and foundation models to make these mechanisms interpretable and actionable. We propose a conceptual framework that encompasses these elements, and articulate a number of research challenges that emerge at the intersection of these different disciplines.

Craig Boutilier | Martin Mladenov | Guy Tennenholtz
