Finding Approximate POMDP Solutions Through Belief Compression
[1] A. Householder,et al. Discussion of a set of points in terms of their mutual distances , 1938 .
[2] Edsger W. Dijkstra,et al. A note on two problems in connexion with graphs , 1959, Numerische Mathematik.
[3] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[4] R. Bellman. Dynamic programming , 1957, Science.
[5] L. Bregman. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming , 1967 .
[6] E. J. Sondik,et al. The Optimal Control of Partially Observable Markov Decision Processes , 1971 .
[7] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[8] H. Akaike. A new look at the statistical model identification , 1974 .
[9] Toru Maruyama. Some Developments in Convex Analysis , 1977 .
[10] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[11] Dimitri P. Bertsekas,et al. Dynamic Programming and Stochastic Control , 1977, IEEE Transactions on Systems, Man, and Cybernetics.
[12] G. Schwarz. Estimating the Dimension of a Model , 1978 .
[13] Lawrence R. Rabiner,et al. A tutorial on Hidden Markov Models , 1986 .
[14] Pravin Varaiya,et al. Stochastic Systems: Estimation, Identification, and Adaptive Control , 1986 .
[15] Marcel Schoppers,et al. Universal Plans for Reactive Robots in Unpredictable Environments , 1987, IJCAI.
[16] P. McCullagh,et al. Generalized Linear Models , 1992 .
[17] Hsien-Te Cheng,et al. Algorithms for partially observable Markov decision processes , 1989 .
[18] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[19] Ingemar J. Cox,et al. Autonomous Robot Vehicles , 1990, Springer New York.
[20] Sheryl R. Young,et al. Use of dialogue, pragmatics and semantics to enhance speech recognition , 1990, Speech Commun..
[21] Leslie Pack Kaelbling,et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.
[22] Jean-Claude Latombe,et al. Robot motion planning , 1991, The Kluwer international series in engineering and computer science.
[23] William S. Lovejoy,et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes , 1991, Oper. Res..
[24] Hugh F. Durrant-Whyte,et al. Mobile robot localization by tracking geometric beacons , 1991, IEEE Trans. Robotics Autom..
[25] D. Moore. Simplicial Mesh Generation with Applications , 1992 .
[26] John A. Nelder,et al. Generalized linear models. 2nd ed. , 1993 .
[27] Jean-Claude Latombe,et al. Planning the Motions of a Mobile Robot in a Sensory Uncertainty Field , 1994, IEEE Trans. Pattern Anal. Mach. Intell..
[28] Gregory Dudek,et al. Precise positioning using model-based maps , 1994, Proceedings of the 1994 IEEE International Conference on Robotics and Automation.
[29] Leslie Pack Kaelbling,et al. Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.
[30] Mark C. Torrance,et al. Natural communication with robots , 1994 .
[31] R. Simmons,et al. Probabilistic Navigation in Partially Observable Environments , 1995 .
[32] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[33] Stuart J. Russell,et al. Stochastic simulation algorithms for dynamic probabilistic networks , 1995, UAI.
[34] Teuvo Kohonen,et al. Self-Organizing Maps , 2010 .
[35] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[36] Andrew McCallum,et al. Instance-Based Utile Distinctions for Reinforcement Learning , 1995 .
[37] Reid G. Simmons,et al. Probabilistic Robot Navigation in Partially Observable Environments , 1995, IJCAI.
[38] Stuart J. Russell,et al. Approximating Optimal Policies for Partially Observable Stochastic Domains , 1995, IJCAI.
[39] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[40] Illah R. Nourbakhsh,et al. DERVISH - An Office-Navigating Robot , 1995, AI Mag..
[41] Mosur Ravishankar,et al. Efficient Algorithms for Speech Recognition , 1996 .
[42] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[43] Scott Davies,et al. Multidimensional Triangulation and Interpolation for Reinforcement Learning , 1996, NIPS.
[44] Andrew McCallum,et al. Reinforcement learning with selective perception and hidden state , 1996 .
[45] Wolfram Burgard,et al. Estimating the Absolute Position of a Mobile Robot Using Position Probability Grids , 1996, AAAI/IAAI, Vol. 2.
[46] Lydia E. Kavraki,et al. Probabilistic roadmaps for path planning in high-dimensional configuration spaces , 1996, IEEE Trans. Robotics Autom..
[47] Liqiang Feng,et al. Navigating Mobile Robots: Systems and Techniques , 1996 .
[48] Craig Boutilier,et al. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.
[49] Leslie Pack Kaelbling,et al. Acting under uncertainty: discrete Bayesian models for mobile-robot navigation , 1996, Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. IROS '96.
[50] Gregory Dudek,et al. Vision-based robot localization without explicit object models , 1996, Proceedings of IEEE International Conference on Robotics and Automation.
[51] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .
[52] Yasuhisa Niimi,et al. A dialog control strategy based on the reliability of speech recognition , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[53] Alexander H. Waibel,et al. Dialogue strategies guiding users to their communicative goals , 1997, EUROSPEECH.
[54] Richard Washington,et al. BI-POMDP: Bounded, Incremental, Partially-Observable Markov-Model Planning , 1997, ECP.
[55] B. S. Manjunath,et al. An Eigenspace Update Algorithm for Image Analysis , 1997, CVGIP Graph. Model. Image Process..
[56] Michael L. Littman,et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes , 1997, UAI.
[57] Ronen I. Brafman,et al. A Heuristic Variable Grid Solution Method for POMDPs , 1997, AAAI/IAAI.
[58] Sam T. Roweis,et al. EM Algorithms for PCA and SPCA , 1997, NIPS.
[59] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[60] Wolfram Burgard,et al. A Probabilistic Approach to Concurrent Mapping and Localization for Mobile Robots , 1998, Auton. Robots.
[61] Wolfram Burgard,et al. Position Estimation for Mobile Robots in Dynamic Environments , 1998, AAAI/IAAI.
[62] Christopher M. Bishop,et al. GTM: The Generative Topographic Mapping , 1998, Neural Computation.
[63] Wolfram Burgard,et al. An experimental comparison of localization methods , 1998, Proceedings. 1998 IEEE/RSJ International Conference on Intelligent Robots and Systems. Innovations in Theory, Practice and Applications (Cat. No.98CH36190).
[64] Wolfram Burgard,et al. Active Markov localization for mobile robots , 1998, Robotics Auton. Syst..
[65] Gregory Dudek,et al. Mobile robot localization from learned landmarks , 1998, Proceedings. 1998 IEEE/RSJ International Conference on Intelligent Robots and Systems. Innovations in Theory, Practice and Applications (Cat. No.98CH36190).
[66] Xavier Boyen,et al. Tractable Inference for Complex Stochastic Processes , 1998, UAI.
[67] Eric A. Hansen,et al. Solving POMDPs by Searching in Policy Space , 1998, UAI.
[68] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[69] A. Cassandra,et al. Exact and approximate algorithms for partially observable Markov decision processes , 1998 .
[70] Hermann Ney,et al. Evaluating dialog systems used in the real world , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[71] Roberto Pieraccini,et al. Using Markov decision process for learning dialogue strategies , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[72] Alexander H. Waibel,et al. Towards spontaneous speech recognition for on-board car navigation and information systems , 1999, EUROSPEECH.
[73] Andrew W. Moore,et al. Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems , 1999, IJCAI.
[74] Kee-Eung Kim,et al. Solving POMDPs by Searching the Space of Finite Policies , 1999, UAI.
[75] Antal van den Bosch. Instance-Family Abstraction in Memory-Based Language Learning , 1999, ICML.
[76] H. Sebastian Seung,et al. Learning the parts of objects by non-negative matrix factorization , 1999, Nature.
[77] Wolfram Burgard,et al. Coastal navigation-mobile robot navigation with uncertainty in dynamic environments , 1999, Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C).
[78] Yishay Mansour,et al. Approximate Planning in Large POMDPs via Reusable Trajectories , 1999, NIPS.
[79] W. Burgard,et al. Markov Localization for Mobile Robots in Dynamic Environments , 1999, J. Artif. Intell. Res..
[80] Geoffrey J. Gordon,et al. Approximate solutions to Markov decision processes , 1999 .
[81] Marilyn A. Walker,et al. Reinforcement Learning for Spoken Dialogue Systems , 1999, NIPS.
[82] Shimei Pan,et al. Empirically Evaluating an Adaptable Spoken Dialogue System , 1999, ArXiv.
[83] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[84] Sebastian Thrun,et al. Coastal Navigation with Mobile Robots , 1999, NIPS.
[85] Kee-Eung Kim,et al. Learning Finite-State Controllers for Partially Observable Environments , 1999, UAI.
[86] Sebastian Thrun,et al. Monte Carlo POMDPs , 1999, NIPS.
[87] Wolfram Burgard,et al. Experiences with an Interactive Museum Tour-Guide Robot , 1999, Artif. Intell..
[88] Kurt Konolige,et al. A gradient method for realtime robot control , 2000, Proceedings. 2000 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2000) (Cat. No.00CH37113).
[89] Clark F. Olson,et al. Probabilistic self-localization for mobile robots , 2000, IEEE Trans. Robotics Autom..
[90] J. Tenenbaum,et al. A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.
[91] Wolfram Burgard,et al. Probabilistic Algorithms and the Interactive Museum Tour-Guide Robot Minerva , 2000, Int. J. Robotics Res..
[92] Marilyn A. Walker,et al. Automatic Optimization of Dialogue Management , 2000, COLING.
[93] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[94] Milos Hauskrecht,et al. Value-Function Approximations for Partially Observable Markov Decision Processes , 2000, J. Artif. Intell. Res..
[95] Alex Pentland,et al. EM for Perceptual Coding and Reinforcement Learning Tasks , 2000 .
[96] Ben M. Chen. Robust and H∞ Control , 2000 .
[97] S T Roweis,et al. Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.
[98] Peter L. Bartlett,et al. Reinforcement Learning in POMDP's via Direct Gradient Ascent , 2000, ICML.
[99] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[100] Zhengzhu Feng,et al. Dynamic Programming for POMDPs Using a Factored State Representation , 2000, AIPS.
[101] Weihong Zhang,et al. Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes , 2011, J. Artif. Intell. Res..
[102] Eric A. Hansen,et al. An Improved Grid-Based Approximation Algorithm for POMDPs , 2001, IJCAI.
[103] Sanjoy Dasgupta,et al. A Generalization of Principal Components Analysis to the Exponential Family , 2001, NIPS.
[104] Geoffrey E. Hinton,et al. Global Coordination of Local Linear Models , 2001, NIPS.
[105] Jeff G. Schneider,et al. Autonomous helicopter control using reinforcement learning policy search methods , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164).
[106] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[107] Wolfram Burgard,et al. Robust Monte Carlo localization for mobile robots , 2001, Artif. Intell..
[108] Bin Yu,et al. Model Selection and the Principle of Minimum Description Length , 2001 .
[109] S. LaValle,et al. Randomized Kinodynamic Planning , 2001 .
[110] Mukund Balasubramanian,et al. The isomap algorithm and topological stability. , 2002, Science.
[111] L. P. Kaelbling,et al. Learning Geometrically-Constrained Hidden Markov Models for Robot Navigation: Bridging the Topological-Geometrical Gap , 2011, J. Artif. Intell. Res..
[112] Geoffrey E. Hinton,et al. Stochastic Neighbor Embedding , 2002, NIPS.
[113] David Andre,et al. State abstraction for programmable reinforcement learning agents , 2002, AAAI/IAAI.
[114] I. Jolliffe. Principal Component Analysis , 2002 .
[115] Geoffrey J. Gordon. Generalized² Linear² Models , 2002, NIPS.
[116] Leslie Pack Kaelbling,et al. Learning Geometrically-Constrained Hidden Markov Models for Robot Navigation: Bridging the Geometrical-Topological Gap , 2002 .
[117] Nicholas Roy,et al. Exponential Family PCA for Belief Compression in POMDPs , 2002, NIPS.
[118] William Whittaker,et al. Conditional particle filters for simultaneous mobile robot localization and people-tracking , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).
[119] Dieter Fox,et al. An experimental comparison of localization methods continued , 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[120] Craig Boutilier,et al. Value-Directed Compression of POMDPs , 2002, NIPS.
[121] Sebastian Thrun,et al. Motion planning through policy search , 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[122] Sridhar Mahadevan,et al. Hierarchical learning and planning in partially observable Markov decision processes , 2002 .
[123] Joelle Pineau,et al. Policy-contingent abstraction for robust robot control , 2002, UAI.
[124] Joelle Pineau,et al. Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.
[125] Anne Condon,et al. On the undecidability of probabilistic planning and related stochastic optimization problems , 2003, Artif. Intell..
[126] Andrew W. Moore,et al. The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces , 1993, Machine Learning.
[127] Teuvo Kohonen,et al. Self-organized formation of topologically correct feature maps , 2004, Biological Cybernetics.
[128] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[129] Michael Isard,et al. CONDENSATION—Conditional Density Propagation for Visual Tracking , 1998, International Journal of Computer Vision.
[130] Andrew W. Moore,et al. Variable Resolution Discretization in Optimal Control , 2002, Machine Learning.
[131] R. Simmons,et al. The effect of representation and knowledge on goal-directed exploration with reinforcement-learning algorithms , 2004, Machine Learning.
[132] Yoram Singer,et al. The Hierarchical Hidden Markov Model: Analysis and Applications , 1998, Machine Learning.
[133] M. E. Galassi,et al. GNU Scientific Library Reference Manual , 2005 .
[134] Nicholas Roy,et al. Finding approximate POMDP solutions through belief compression , 2005 .
[135] George E. Monahan,et al. A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 2007 .
[136] Gene H. Golub,et al. Singular value decomposition and least squares solutions , 1970, Milestones in Matrix Computation.