Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints
[1] S. Darak,et al. Multiarmed Bandit Algorithms on Zynq System-on-Chip: Go Frequentist or Bayesian?, 2022, IEEE Transactions on Neural Networks and Learning Systems.
[2] Zulong Chen,et al. SAR-Net: A Scenario-Aware Ranking Network for Personalized Fair Recommendation in Hundreds of Travel Scenarios , 2021, CIKM.
[3] Xiangnan He,et al. Causal Incremental Graph Convolution for Recommender System Retraining , 2021, IEEE Transactions on Neural Networks and Learning Systems.
[4] Mingsheng Shang,et al. An L1-and-L2-Norm-Oriented Latent Factor Model for Recommender Systems , 2021, IEEE Transactions on Neural Networks and Learning Systems.
[5] Sanjay Misra,et al. A Teaching-Learning-Based Optimization Algorithm for the Weighted Set-Covering Problem , 2020, Tehnicki vjesnik - Technical Gazette.
[6] Junning Liu,et al. Progressive Layered Extraction (PLE): A Novel Multi-Task Learning (MTL) Model for Personalized Recommendations , 2020, RecSys.
[7] Csaba Szepesvari,et al. Bandit Algorithms , 2020.
[8] Safia Kedad-Sidhoum,et al. Reinforcement Learning for Variable Selection in a Branch and Bound Algorithm , 2020, CPAIOR.
[9] M. de Rijke,et al. Cascading Hybrid Bandits: Online Learning to Rank for Relevance and Diversity , 2019, RecSys.
[10] Yujing Hu,et al. Multi-Agent Game Abstraction via Graph Attention Neural Network , 2019, AAAI.
[11] Quanquan Gu,et al. Neural Contextual Bandits with UCB-based Exploration , 2019, ICML.
[12] O. Cappé,et al. Weighted Linear Bandits for Non-Stationary Environments , 2019, NeurIPS.
[13] A. Rajkumar,et al. Censored Semi-Bandits: A Framework for Resource Allocation with Censored Feedback , 2019, NeurIPS.
[14] Masahiro Ono,et al. Co-training for Policy Learning , 2019, UAI.
[15] Guangquan Zhang,et al. A Cross-Domain Recommender System With Kernel-Induced Knowledge Transfer for Overlapping Entities , 2019, IEEE Transactions on Neural Networks and Learning Systems.
[16] Andrea Lodi,et al. Exact Combinatorial Optimization with Graph Convolutional Neural Networks , 2019, NeurIPS.
[17] Y. Mansour,et al. Top-$k$ Combinatorial Bandits with Full-Bandit Feedback , 2019, ALT.
[18] Yu Gong,et al. Exact-K Recommendation via Maximal Clique Optimization , 2019, KDD.
[19] Max Welling,et al. Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement , 2019, ICML.
[20] Masashi Sugiyama,et al. Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback , 2019, Neural Computation.
[21] Julian Zimmert,et al. Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously , 2019, ICML.
[22] Jasper Snoek,et al. DPPNet: Approximating Determinantal Point Processes with Deep Networks , 2019, NeurIPS.
[23] Ed H. Chi,et al. Top-K Off-Policy Correction for a REINFORCE Recommender System , 2018, WSDM.
[24] Vaneet Aggarwal,et al. Regret Bounds for Stochastic Combinatorial Multi-Armed Bandits with Linear Space Complexity , 2018, ArXiv.
[25] Zhe Zhao,et al. Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts , 2018, KDD.
[26] Wei Zhang,et al. Master-Slave Curriculum Design for Reinforcement Learning , 2018, IJCAI.
[27] Gregory Ditzler,et al. A Sequential Learning Approach for Scaling Up Filter-Based Feature Subset Selection , 2018, IEEE Transactions on Neural Networks and Learning Systems.
[28] Shie Mannor,et al. Reward Constrained Policy Optimization , 2018, ICLR.
[29] Nicholas Jing Yuan,et al. DRN: A Deep Reinforcement Learning Framework for News Recommendation , 2018, WWW.
[30] Maria-Florina Balcan,et al. Learning to Branch , 2018, ICML.
[31] Yizhou Wang,et al. Revisiting the Master-Slave Architecture in Multi-Agent Deep Reinforcement Learning , 2017, ArXiv.
[32] Guy Van den Broeck,et al. A Semantic Loss Function for Deep Learning with Symbolic Knowledge , 2017, ICML.
[33] Y. Hoogendoorn. The Maximum Coverage Problem , 2017.
[34] Yunming Ye,et al. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction , 2017, IJCAI.
[35] Samy Bengio,et al. Neural Combinatorial Optimization with Reinforcement Learning , 2016, ICLR.
[36] Ben Poole,et al. Categorical Reparameterization with Gumbel-Softmax , 2016, ICLR.
[37] Dong Yu,et al. Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features , 2016, KDD.
[38] Heng-Tze Cheng,et al. Wide & Deep Learning for Recommender Systems , 2016, DLRS@RecSys.
[39] MengChu Zhou,et al. A Nonnegative Latent Factor Model for Large-Scale Sparse Matrices in Recommender Systems via Alternating Direction Method , 2016, IEEE Transactions on Neural Networks and Learning Systems.
[40] Richard Evans,et al. Deep Reinforcement Learning in Large Discrete Action Spaces , 2015, arXiv:1512.07679.
[41] Wei Cao,et al. On Top-k Selection in Multi-Armed Bandits and Hidden Bipartite Graphs , 2015, NIPS.
[42] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[43] Zheng Wen,et al. Cascading Bandits: Learning to Rank in the Cascade Model , 2015, ICML.
[44] Wei Chen,et al. Combinatorial Partial Monitoring Game with Linear Feedback and Its Applications , 2014, ICML.
[45] Claudio Gentile,et al. A Gang of Bandits , 2013, NIPS.
[46] Sham M. Kakade,et al. Towards Minimax Policies for Online Linear Optimization with Bandit Feedback , 2012, COLT.
[47] Aditya G. Parameswaran,et al. Recommendation systems with complex constraints: A course recommendation perspective , 2011, TOIS.
[48] Wei Chu,et al. Contextual Bandits with Linear Payoff Functions , 2011, AISTATS.
[49] Enrique Vidal,et al. Computation of Normalized Edit Distance and Applications , 1993, IEEE Trans. Pattern Anal. Mach. Intell.
[50] Junchi Yan,et al. Towards One-shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case , 2023, ICLR.
[51] Zheng Wen,et al. Cascading Linear Submodular Bandits: Accounting for Position Bias and Diversity in Online Learning to Rank , 2019, UAI.
[52] Baochun Li,et al. Post: Device Placement with Cross-Entropy Minimization and Proximal Policy Optimization , 2018, NeurIPS.