Importance sampling for reinforcement learning with multiple objectives
[1] T. Kloek, et al. Bayesian estimates of equation system parameters, An application of integration by Monte Carlo, 1976.
[2] E. H. Clarke. Incentives in public decision-making, 1980.
[3] Y. Amihud, et al. Dealership market: Market-making with inventory, 1980.
[4] T. Ho, et al. Optimal dealer pricing under transactions and return uncertainty, 1981.
[5] Reuven Y. Rubinstein, et al. Simulation and the Monte Carlo method, 1981, Wiley series in probability and mathematical statistics.
[6] T. Ho, et al. The Dynamics of Dealer Markets Under Competition, 1983.
[7] Paul R. Milgrom, et al. Bid, ask and transaction prices in a specialist market with heterogeneously informed traders, 1985.
[8] Maureen O'Hara, et al. The Microeconomics of Market Making, 1986, Journal of Financial and Quantitative Analysis.
[9] George E. P. Box, et al. Empirical Model-Building and Response Surfaces, 1988.
[10] M. Resnik. Choices: An Introduction to Decision Theory, 1990.
[11] J. Geweke, et al. Bayesian Inference in Econometric Models Using Monte Carlo Integration, 1989.
[12] Lawrence R. Rabiner, et al. A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.
[13] William H. Press, et al. Numerical recipes, 1990.
[14] Heekuck Oh, et al. Neural Networks for Pattern Recognition, 1993, Adv. Comput.
[15] Andrew W. Moore, et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function, 1994, NIPS.
[16] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[17] T. Hesterberg, et al. Weighted Average Importance Sampling and Defensive Mixture Distributions, 1995.
[18] A. Mas-Colell, et al. Microeconomic Theory, 1995.
[19] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[20] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[21] Michael L. Littman, et al. Algorithms for Sequential Decision Making, 1996.
[22] Csaba Szepesvári, et al. Multi-criteria Reinforcement Learning, 1998, ICML.
[23] John Loch, et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes, 1998, ICML.
[24] Konkoly Thege. Multi-criteria Reinforcement Learning, 1998.
[25] Andrew W. Moore, et al. Gradient Descent for General Reinforcement Learning, 1998, NIPS.
[26] Satinder P. Singh, et al. Experimental Results on Learning Stochastic Memoryless Policies for Partially Observable Markov Decision Processes, 1998, NIPS.
[27] Leslie Pack Kaelbling, et al. Learning Policies with External Memory, 1999, ICML.
[28] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[29] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[30] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[31] Kee-Eung Kim, et al. Learning Finite-State Controllers for Partially Observable Environments, 1999, UAI.
[32] Christian R. Shelton, et al. Balancing Multiple Sources of Reward in Reinforcement Learning, 2000, NIPS.
[33] Andrew W. Moore, et al. A Nonparametric Approach to Noisy and Costly Optimization, 2000, ICML.
[34] Geoffrey J. Gordon. Reinforcement Learning with Function Approximation Converges to a Region, 2000, NIPS.
[35] Doina Precup, et al. Eligibility Traces for Off-Policy Policy Evaluation, 2000, ICML.
[36] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.
[37] Nicolas Meuleau, et al. Exploration in Gradient-Based Reinforcement Learning, 2001.
[38] Leonid Peshkin, et al. Bounds on Sample Size for Policy Evaluation in Markov Environments, 2001, COLT/EuroCOLT.
[39] Peter Geibel, et al. Reinforcement Learning with Bounded Risk, 2001, ICML.
[40] Christian R. Shelton, et al. An Electronic Market-maker, 2001.
[41] Sanjoy Dasgupta, et al. Off-Policy Temporal Difference Learning with Function Approximation, 2001, ICML.
[42] William H. Press, et al. Numerical recipes in C, 2002.