Natural Gradient Policy for Average Cost SMDP Problem
[1] R. Bellman. Dynamic programming , 1957, Science.
[2] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[3] Morton E. O'Kelly,et al. Detecting outliers in irregularly distributed spatial data sets by locally adaptive and robust statistical analysis and GIS , 2001, Int. J. Geogr. Inf. Sci..
[4] Vijayalakshmi Atluri,et al. Neighborhood based detection of anomalies in high dimensional spatio-temporal sensor datasets , 2004, SAC '04.
[5] Shun-ichi Amari,et al. A Theory of Adaptive Pattern Classifiers , 1967, IEEE Trans. Electron. Comput..
[6] Chang-Tien Lu,et al. Detecting spatial outliers with multiple attributes , 2003, Proceedings. 15th IEEE International Conference on Tools with Artificial Intelligence.
[7] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[8] Chang-Tien Lu,et al. Algorithms for spatial outlier detection , 2003, Third IEEE International Conference on Data Mining.
[9] Shashi Shekhar,et al. Detecting graph-based spatial outliers: algorithms and applications (a summary of results) , 2001, KDD '01.
[10] W. Tobler. A Computer Movie Simulating Urban Growth in the Detroit Region , 1970 .
[11] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[12] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[13] Michael O. Duff,et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems , 1994, NIPS.
[14] Kenji Fukumizu,et al. Adaptive Method of Realizing Natural Gradient Learning for Multilayer Perceptrons , 2000, Neural Computation.
[15] Michael Ian Shamos,et al. Computational geometry: an introduction , 1985 .
[16] Graham J. Williams,et al. On-Line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms , 2000, KDD '00.
[17] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[18] Rémi Munos,et al. Policy Gradient in Continuous Time , 2006, J. Mach. Learn. Res..
[19] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[20] John N. Tsitsiklis,et al. Gradient Convergence in Gradient methods with Errors , 1999, SIAM J. Optim..
[21] John N. Tsitsiklis,et al. Call admission control and routing in integrated services networks using reinforcement learning , 1998, Proceedings of the 37th IEEE Conference on Decision and Control (Cat. No.98CH36171).
[22] T. Moon,et al. Mathematical Methods and Algorithms for Signal Processing , 1999 .
[23] Keith W. Ross,et al. Multiservice Loss Models for Broadband Telecommunication Networks , 1997 .
[24] P. Rousseeuw,et al. Computing depth contours of bivariate point clouds , 1996 .
[25] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[26] Arnaud Doucet,et al. A policy gradient method for semi-Markov decision processes with application to call admission control , 2007, Eur. J. Oper. Res..
[27] Chang-Tien Lu,et al. Detecting region outliers in meteorological data , 2003, GIS '03.
[28] Marco Riani,et al. The Ordering of Spatial Data and the Detection of Multiple Outliers , 1999 .
[29] Graham J. Wills,et al. Dynamic Graphics for Exploring Spatial Data with Application to Locating Global and Local Anomalies , 1991 .
[30] Abhijit Gosavi,et al. Reinforcement learning for long-run average cost , 2004, Eur. J. Oper. Res..
[31] D. Blackwell. Discounted Dynamic Programming , 1965 .
[32] Jeff G. Schneider,et al. Covariant policy search , 2003, IJCAI 2003.
[33] Robert Haining,et al. Spatial Data Analysis in the Social and Environmental Sciences , 1990 .
[34] H. Vincent Poor,et al. Integrated voice/data call admission control for wireless DS-CDMA systems , 2002, IEEE Trans. Signal Process..
[35] D. Blackwell. Discrete Dynamic Programming , 1962 .
[36] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[37] Samuel Karlin,et al. The structure of dynamic programing models , 1955 .
[38] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.