Introduction

This is the third special issue of Machine Learning on the subject of reinforcement learning (the first and second special issues were edited by Richard Sutton in 1992 and Leslie Kaelbling in 1996, respectively). The field of reinforcement learning continues to grow, attracting ideas and participants not only from AI and machine learning but also from neuroscience, cognitive science, operations research, and control. More than a decade of the resulting research has led to great progress in the theoretical underpinnings of the field, much of it derived from the theory of dynamic programming and the associated frameworks of Markov Decision Processes (MDPs), semi-MDPs, and partially observable MDPs (POMDPs) (see the texts by Sutton and Barto (1998) and Bertsekas and Tsitsiklis (1996) for excellent overviews). Much has been accomplished, and yet, of course, much remains to be done. I will take advantage of this guest editorial to outline my general views on three open issues that are key to further rapid progress in reinforcement learning, and then briefly survey the papers in this special issue.

References

[11] A. McCallum. Reinforcement learning with selective perception and hidden state. Ph.D. thesis, University of Rochester, 1996.

[12] R. Parr and S. J. Russell. Reinforcement learning with hierarchies of machines. In NIPS, 1997.

[13] T. G. Dietterich. The MAXQ method for hierarchical reinforcement learning. In ICML, 1998.

[18] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
