Fraud Regulating Policy for E-Commerce via Constrained Contextual Bandits
Jie Zhang | Zehong Hu | Zhen Wang | Zhao Li | Shichang Hu | Shasha Ruan
[1] Csaba Szepesvári, et al. Improved Algorithms for Linear Stochastic Bandits, 2011, NIPS.
[2] Pingzhong Tang, et al. Ranking Mechanism Design for Price-setting Agents in E-commerce, 2018, AAMAS.
[3] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[4] Qiang Wu, et al. McRank: Learning to Rank Using Multiple Classification and Gradient Boosting, 2007, NIPS.
[5] R. Srikant, et al. Algorithms with Logarithmic or Sublinear Regret for Constrained Contextual Bandits, 2015, NIPS.
[6] Bo An, et al. Impression Allocation for Combating Fraud in E-commerce Via Deep Reinforcement Learning with Action Norm Penalty, 2018, IJCAI.
[7] Yiwei Zhang, et al. Reinforcement Mechanism Design for e-commerce, 2017, WWW.
[8] Zhao Li, et al. Detecting and Characterizing Web Bot Traffic in a Large E-commerce Marketplace, 2018, ESORICS.
[9] Luo Si, et al. Cascade Ranking for Operational E-commerce Search, 2017, KDD.
[10] Zhao Li, et al. Fraud Transaction Recognition: A Money Flow Network Approach, 2015, CIKM.
[11] Andreas Krause, et al. Contextual Gaussian Process Bandit Optimization, 2011, NIPS.
[12] Pingzhong Tang, et al. Reinforcement mechanism design, 2017, IJCAI.
[13] Wei Chu, et al. A contextual-bandit approach to personalized news article recommendation, 2010, WWW '10.
[14] Ashish Kapoor, et al. Safety-Aware Algorithms for Adversarial Contextual Bandit, 2017, ICML.
[15] Csaba Szepesvári, et al. Online Learning to Rank in Stochastic Click Models, 2017, ICML.
[16] Yiwei Zhang, et al. Reinforcement Mechanism Design for Fraudulent Behaviour in e-Commerce, 2018, AAAI.
[17] Wei Chu, et al. Contextual Bandits with Linear Payoff Functions, 2011, AISTATS.
[18] Shipra Agrawal, et al. Thompson Sampling for Contextual Bandits with Linear Payoffs, 2012, ICML.
[19] Qing Wang, et al. Online Context-Aware Recommendation with Time Varying Multi-Armed Bandit, 2016, KDD.
[20] Nikhil R. Devanur, et al. An efficient algorithm for contextual bandits with knapsacks, and an extension to concave objectives, 2015, COLT.
[21] Ron Kohavi, et al. Online Controlled Experiments and A/B Testing, 2017, Encyclopedia of Machine Learning and Data Mining.
[22] Fuzhen Zhuang, et al. Policy Gradients for Contextual Bandits, 2018, ArXiv.
[23] M. J. Fryer, et al. Simulation and the Monte Carlo method, 1981, Wiley series in probability and mathematical statistics.
[24] Yujing Hu, et al. Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application, 2018, KDD.
[25] Aurélien Garivier, et al. Parametric Bandits: The Generalized Linear Case, 2010, NIPS.
[26] Andreas Krause, et al. Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting, 2009, IEEE Transactions on Information Theory.
[27] Pieter Abbeel, et al. Constrained Policy Optimization, 2017, ICML.
[28] Gregory N. Hullender, et al. Learning to rank using gradient descent, 2005, ICML.
[29] Stephen P. Boyd, et al. Convex Optimization, 2004, Algorithms and Theory of Computation Handbook.
[30] Yiqun Liu, et al. Detecting Crowdturfing "Add to Favorites" Activities in Online Shopping, 2018, WWW.
[31] Benjamin Van Roy, et al. Ensemble Sampling, 2017, NIPS.
[32] Xiang Li, et al. Perceive Your Users in Depth: Learning Universal User Representations from Multiple E-commerce Tasks, 2018, KDD.