Michael I. Jordan | Niki Kilbertus | Karl Krauth | Jiri Hron
[1] Michael I. Jordan, et al. Learning from eXtreme Bandit Feedback, 2021, AAAI.
[2] Ed H. Chi, et al. Off-policy Learning in Two-stage Recommender Systems, 2020, WWW.
[3] David M. Blei, et al. Scalable Recommendation with Hierarchical Poisson Factorization, 2015, UAI.
[4] Thorsten Joachims, et al. Unbiased Learning-to-Rank with Biased Feedback, 2016, WSDM.
[5] Sampath Kannan, et al. A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem, 2018, NeurIPS.
[6] Maurizio Morisio, et al. Hybrid recommender systems: A systematic literature review, 2019, Intell. Data Anal.
[7] John Riedl, et al. Recommender Systems for Large-scale E-Commerce: Scalable Neighborhood Formation Using Clustering, 2002.
[8] K. Jarrod Millman, et al. Array programming with NumPy, 2020, Nature.
[9] S. Levine, et al. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, 2020, ArXiv.
[10] Julian Zimmert, et al. Adapting to Misspecification in Contextual Bandits, 2021, NeurIPS.
[11] Li Wei, et al. Sampling-bias-corrected neural modeling for large corpus item recommendations, 2019, RecSys.
[12] David Simchi-Levi, et al. Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability, 2020, SSRN Electronic Journal.
[13] John N. Tsitsiklis, et al. Linearly Parameterized Bandits, 2008, Math. Oper. Res.
[14] Michael I. Jordan, et al. Exploration in two-stage recommender systems, 2020, ArXiv.
[15] Xiangnan He, et al. Attentive Collaborative Filtering: Multimedia Recommendation with Item- and Component-Level Attention, 2017, SIGIR.
[16] Lihong Li, et al. Learning from Logged Implicit Exploration Data, 2010, NIPS.
[17] Khashayar Khosravi, et al. Mostly Exploration-Free Algorithms for Contextual Bandits, 2017, Manag. Sci.
[18] Thorsten Joachims, et al. Batch learning from logged bandit feedback through counterfactual risk minimization, 2015, J. Mach. Learn. Res.
[19] John D. Hunter, et al. Matplotlib: A 2D Graphics Environment, 2007, Computing in Science & Engineering.
[20] Alexander Rakhlin, et al. Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles, 2020, ICML.
[21] John Langford, et al. Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits, 2014, ICML.
[22] Yehuda Koren, et al. On the Difficulty of Evaluating Baselines: A Study on Recommender Systems, 2019, ArXiv.
[23] Csaba Szepesvari, et al. Bandit Algorithms, 2020.
[24] Jimmy J. Lin, et al. Fast candidate generation for two-phase document ranking: postings list intersection with bloom filters, 2012, CIKM.
[25] Jöran Beel, et al. A Comparison of Offline Evaluations, Online Evaluations, and User Studies in the Context of Research-Paper Recommender Systems, 2015, TPDL.
[26] Julian McAuley, et al. Candidate Generation with Binary Codes for Large-Scale Top-N Recommendation, 2019, CIKM.
[27] Wes McKinney, et al. Data Structures for Statistical Computing in Python, 2010, SciPy.
[28] Claudio Gentile, et al. On multilabel classification and ranking with bandit feedback, 2014, J. Mach. Learn. Res.
[29] Parul Parashar, et al. Neural Networks in Machine Learning, 2014.
[30] Bo Zhao, et al. CaSMoS: A Framework for Learning Candidate Selection Models over Structured Queries and Documents, 2016, KDD.
[31] Paul Covington, et al. Deep Neural Networks for YouTube Recommendations, 2016, RecSys.
[32] Philip M. Long, et al. Associative Reinforcement Learning using Linear Probabilistic Concepts, 1999, ICML.
[33] Thorsten Joachims, et al. Estimating Position Bias without Intrusive Interventions, 2018, WSDM.
[34] Thomas P. Hayes, et al. Stochastic Linear Optimization under Bandit Feedback, 2008, COLT.
[35] Domonkos Tikk, et al. Scalable Collaborative Filtering Approaches for Large Recommender Systems, 2009, J. Mach. Learn. Res.
[36] Wei Chu, et al. A contextual-bandit approach to personalized news article recommendation, 2010, WWW.
[37] Michael L. Waskom, et al. Seaborn: Statistical Data Visualization, 2021, J. Open Source Softw.
[38] Thomas Kluyver, et al. Jupyter Notebooks - a publishing format for reproducible computational workflows, 2016, ELPUB.
[39] Susan Athey, et al. Tractable contextual bandits beyond realizability, 2020, AISTATS.
[40] Manfred K. Warmuth, et al. The Weighted Majority Algorithm, 1994, Inf. Comput.
[41] John Langford, et al. Off-policy evaluation for slate recommendation, 2016, NIPS.
[42] Benjamin Recht, et al. Recommendations and user agency: the reachability of collaboratively-filtered information, 2020, FAT*.
[43] Ed H. Chi, et al. Top-K Off-Policy Correction for a REINFORCE Recommender System, 2018, WSDM.
[44] Carl E. Rasmussen, et al. Infinite Mixtures of Gaussian Process Experts, 2001, NIPS.
[45] Andreas Krause, et al. Learning to Interact With Learning Agents, 2018, AAAI.
[46] Craig Boutilier, et al. SlateQ: A Tractable Decomposition for Reinforcement Learning with Recommendation Sets, 2019, IJCAI.
[47] Mariarosaria Taddeo, et al. Recommender systems and their ethical challenges, 2020, AI & Society.
[48] Thorsten Joachims, et al. Recommendations as Treatments: Debiasing Learning and Evaluation, 2016, ICML.
[49] Marc G. Bellemare, et al. Safe and Efficient Off-Policy Reinforcement Learning, 2016, NIPS.
[50] Ruslan Salakhutdinov, et al. Probabilistic Matrix Factorization, 2007, NIPS.
[51] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[52] Robert A. Jacobs, et al. Hierarchical Mixtures of Experts and the EM Algorithm, 1993, Neural Computation.
[53] Ruosong Wang, et al. Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?, 2020, ICLR.
[54] Haipeng Luo, et al. Practical Contextual Bandits with Regression Oracles, 2018, ICML.
[55] Tor Lattimore, et al. Learning with Good Feature Representations in Bandits and in RL with a Generative Model, 2020, ICML.
[56] Jonathan L. Herlocker, et al. Evaluating collaborative filtering recommender systems, 2004, TOIS.
[57] Joseph N. Wilson, et al. Twenty Years of Mixture of Experts, 2012, IEEE Transactions on Neural Networks and Learning Systems.
[58] Derek Bridge, et al. Diversity, Serendipity, Novelty, and Coverage, 2016, ACM Trans. Interact. Intell. Syst.
[59] Philip S. Thomas, et al. Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning, 2016, ICML.
[60] Jure Leskovec, et al. Hidden factors and hidden topics: understanding rating dimensions with review text, 2013, RecSys.
[61] Kun Gai, et al. Learning Tree-based Deep Model for Recommender Systems, 2018, KDD.
[62] Michael I. Jordan, et al. Hierarchies of Adaptive Experts, 1991, NIPS.
[63] John Langford, et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information, 2007, NIPS.
[64] Wei Chu, et al. Contextual Bandits with Linear Payoff Functions, 2011, AISTATS.
[65] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[66] L. Breiman. Arcing Classifiers, 1998.
[67] Koby Crammer, et al. Multiclass classification with bandit feedback using adaptive regularization, 2012, Machine Learning.
[68] J. Friedman. Greedy function approximation: A gradient boosting machine, 2001.
[69] J. Friedman. Stochastic gradient boosting, 2002.
[70] Geoffrey E. Hinton, et al. Adaptive Mixtures of Local Experts, 1991, Neural Computation.
[71] Peter Auer, et al. The Nonstochastic Multiarmed Bandit Problem, 2002, SIAM J. Comput.
[72] John Langford, et al. A Contextual Bandit Bake-off, 2018, J. Mach. Learn. Res.
[73] Sreeram Kannan, et al. Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms, 2018, ICML.
[74] Haipeng Luo, et al. Corralling a Band of Bandit Algorithms, 2016, COLT.
[75] Jure Leskovec, et al. Pixie: A System for Recommending 3+ Billion Items to 200+ Million Users in Real-Time, 2017, WWW.
[76] John Langford, et al. Doubly Robust Policy Evaluation and Learning, 2011, ICML.
[77] Li Wei, et al. Recommending what video to watch next: a multitask ranking system, 2019, RecSys.
[78] Peter Auer, et al. Using Confidence Bounds for Exploitation-Exploration Trade-offs, 2003, J. Mach. Learn. Res.
[79] Lysandre Debut, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.
[80] Noam Shazeer, et al. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity, 2021, ArXiv.
[81] Steffen Rendle, et al. Factorization Machines, 2010, ICDM.
[82] Fabio Stella, et al. Contrasting Offline and Online Results when Evaluating Recommendation Algorithms, 2016, RecSys.
[83] Tat-Seng Chua, et al. Neural Collaborative Filtering, 2017, WWW.
[84] Geoffrey E. Hinton, et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, 2017, ICLR.
[85] Aditya Gopalan, et al. Misspecified Linear Bandits, 2017, AAAI.
[86] Michael I. Jordan, et al. Do Offline Metrics Predict Online Performance in Recommender Systems?, 2020, ArXiv.
[87] Will Dabney, et al. Adaptive Trade-Offs in Off-Policy Learning, 2020, AISTATS.
[88] Yehuda Koren, et al. Matrix Factorization Techniques for Recommender Systems, 2009, Computer.
[89] Moritz Hardt, et al. From Optimizing Engagement to Measuring Value, 2020, FAccT.
[90] Joel Nothman, et al. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python, 2019, ArXiv.
[91] Wei Li, et al. Multi-Interest Network with Dynamic Routing for Recommendation at Tmall, 2019, CIKM.
[92] Xu Sun, et al. Coarse-grained Candidate Generation and Fine-grained Re-ranking for Chinese Abbreviation Prediction, 2014, EMNLP.
[93] John Langford, et al. Efficient Optimal Learning for Contextual Bandits, 2011, UAI.
[94] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[95] F. Maxwell Harper, et al. The MovieLens Datasets: History and Context, 2016, TIIS.