Exponential Family Predictive Representations of State
[1] C. E. Shannon,et al. A mathematical theory of communication , 1948, Bell Syst. Tech. J..
[2] Solomon Kullback,et al. Information Theory and Statistics , 1960 .
[3] R. E. Kalman,et al. A New Approach to Linear Filtering and Prediction Problems , 1960 .
[4] A. Rényi. On Measures of Entropy and Information , 1961 .
[5] D. Blackwell. Discrete Dynamic Programming , 1962 .
[6] L. Baum,et al. A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .
[7] Edward J. Sondik,et al. The optimal control of partially observable Markov processes , 1971 .
[8] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm (with discussion) , 1977 .
[9] L. Glass,et al. Oscillation and chaos in physiological control systems. , 1977, Science.
[10] Gene H. Golub,et al. Matrix computations , 1983 .
[11] L. Rabiner,et al. An introduction to hidden Markov models , 1986, IEEE ASSP Magazine.
[12] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[13] Daniel E. Lane,et al. A Partially Observable Model of Decision Making by Fishermen , 1989, Oper. Res..
[14] Gregory F. Cooper,et al. The Computational Complexity of Probabilistic Inference Using Bayesian Belief Networks , 1990, Artif. Intell..
[15] E. Jaynes,et al. Notes on present status and future prospects , 1991 .
[16] Geoffrey E. Hinton,et al. Adaptive Mixtures of Local Experts , 1991, Neural Computation.
[17] Andreas Stolcke,et al. Hidden Markov Model Induction by Bayesian Model Merging , 1992, NIPS.
[18] Lonnie Chrisman,et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.
[19] D. Haussler,et al. Protein modeling using hidden Markov models: analysis of globins , 1993, [1993] Proceedings of the Twenty-sixth Hawaii International Conference on System Sciences.
[20] Jianqing Fan,et al. Local polynomial modelling and its applications , 1994 .
[21] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[22] Andreas S. Weigend,et al. Time Series Prediction: Forecasting the Future and Understanding the Past , 1994 .
[23] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[24] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[25] Stuart J. Russell,et al. Adaptive Probabilistic Networks , 1994 .
[26] M. Littman. The Witness Algorithm: Solving Partially Observable Markov Decision Processes , 1994 .
[27] Jagat Narain Kapur,et al. Measures of information and their applications , 1994 .
[28] Christopher M. Bishop,et al. Neural networks for pattern recognition , 1995 .
[29] Leslie Pack Kaelbling,et al. Learning Dynamics: System Identification for Perceptually Challenged Agents , 1995, Artif. Intell..
[30] Andrew McCallum,et al. Reinforcement learning with selective perception and hidden state , 1996 .
[31] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .
[32] John D. Lafferty,et al. Inducing Features of Random Fields , 1995, IEEE Trans. Pattern Anal. Mach. Intell..
[33] John Loch,et al. Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes , 1998, ICML.
[34] Xavier Boyen,et al. Tractable Inference for Complex Stochastic Processes , 1998, UAI.
[35] Eric A. Hansen,et al. Solving POMDPs by Searching in Policy Space , 1998, UAI.
[36] A. Cassandra,et al. Exact and approximate algorithms for partially observable markov decision processes , 1998 .
[37] J. Crutchfield,et al. Computational Mechanics: Pattern and Prediction, Structure and Simplicity , 1999, ArXiv.
[38] Stephen J. Wright,et al. Numerical Optimization , 1999 .
[39] Andrew McCallum,et al. Maximum Entropy Markov Models for Information Extraction and Segmentation , 2000, ICML.
[40] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[41] Richard S. Sutton,et al. Predictive Representations of State , 2001, NIPS.
[42] Charu C. Aggarwal,et al. On the Surprising Behavior of Distance Metrics in High Dimensional Spaces , 2001, ICDT.
[43] Rudolph van der Merwe,et al. The square-root unscented Kalman filter for state and parameter-estimation , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[44] Tom Minka,et al. A family of algorithms for approximate Bayesian inference , 2001 .
[45] Deniz Erdoğmuş,et al. Blind source separation using Renyi's mutual information , 2001, IEEE Signal Processing Letters.
[46] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.
[47] Douglas Aberdeen,et al. Scalable Internal-State Policy-Gradient Methods for POMDPs , 2002, ICML.
[48] L. P. Kaelbling,et al. Learning Geometrically-Constrained Hidden Markov Models for Robot Navigation: Bridging the Topological-Geometrical Gap , 2002, J. Artif. Intell. Res..
[49] Daniel Nikovski,et al. State-aggregation algorithms for learning probabilistic models for robot control , 2002 .
[50] Hugh F. Durrant-Whyte,et al. Simultaneous Mapping and Localization with Sparse Extended Information Filters: Theory and Initial Results , 2004, WAFR.
[51] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.
[52] William T. Freeman,et al. Understanding belief propagation and its generalizations , 2003 .
[53] H. Jaeger. Discrete-time, discrete-valued observable operator models: a tutorial , 2003 .
[54] Ali H. Sayed,et al. Fundamentals Of Adaptive Filtering , 2003 .
[55] Kari Torkkola,et al. Feature Extraction by Non-Parametric Mutual Information Maximization , 2003, J. Mach. Learn. Res..
[56] Joelle Pineau,et al. Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.
[57] Shie Mannor,et al. Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning , 2003, ICML.
[58] Yun He,et al. A generalized divergence measure for robust image registration , 2003, IEEE Trans. Signal Process..
[59] A. Cassandra. A Survey of POMDP Applications , 2003 .
[60] M. Lesperance,et al. Piecewise regression: a tool for identifying ecological thresholds , 2003 .
[61] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[62] Yousef Saad,et al. Iterative methods for sparse linear systems , 2003 .
[63] Peter Stone,et al. Learning Predictive State Representations , 2003, ICML.
[64] Wolfram Burgard,et al. Autonomous exploration and mapping of abandoned mines , 2004, IEEE Robotics & Automation Magazine.
[65] Shie Mannor,et al. The kernel recursive least-squares algorithm , 2004, IEEE Transactions on Signal Processing.
[66] Michael R. James,et al. Learning and discovery of predictive state representations in dynamical systems with reset , 2004, ICML.
[67] Michael R. James,et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems , 2004, UAI.
[68] Pieter Bram Bakker,et al. The state of mind : reinforcement learning with recurrent neural networks , 2004 .
[69] Sridhar Mahadevan,et al. Average reward reinforcement learning: Foundations, algorithms, and empirical results , 2004, Machine Learning.
[70] Sebastian Thrun,et al. Learning low dimensional predictive representations , 2004, ICML.
[71] Richard S. Sutton,et al. Temporal-Difference Networks , 2004, NIPS.
[72] Marco Wiering,et al. Utile distinction hidden Markov models , 2004, ICML.
[73] John Platt,et al. FastMap, MetricMap, and Landmark MDS are all Nystrom Algorithms , 2005, AISTATS.
[74] Michael R. James,et al. Planning in Models that Combine Memory with Predictive Representations of State , 2005, AAAI.
[75] Udo Frese. A Proof for the Approximate Sparsity of SLAM Information Matrices , 2005, Proceedings of the 2005 IEEE International Conference on Robotics and Automation.
[76] Matthew R. Rudary,et al. Predictive Linear-Gaussian Models of Stochastic Dynamical Systems , 2005, UAI.
[77] Satinder Singh Baveja,et al. Using predictions for planning and modeling in stochastic environments , 2005 .
[78] Richard S. Sutton,et al. Temporal-Difference Networks with History , 2005, IJCAI.
[79] Richard S. Sutton,et al. TD(λ) networks: temporal-difference networks with eligibility traces , 2005, ICML.
[80] Eric Wiewiora,et al. Learning predictive representations from a history , 2005, ICML.
[81] Michael R. James,et al. Combining Memory and Landmarks with Predictive State Representations , 2005, IJCAI.
[82] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998 .
[83] Richard S. Sutton,et al. Using Predictive Representations to Improve Generalization in Reinforcement Learning , 2005, IJCAI.
[84] Michael R. James,et al. Learning predictive state representations in dynamical systems without reset , 2005, ICML.
[85] Kevin D. Seppi,et al. Prioritization Methods for Accelerating MDP Solvers , 2005, J. Mach. Learn. Res..
[86] Stefan Schaal,et al. Natural Actor-Critic , 2008, Neurocomputing.
[87] Michael H. Bowling,et al. Online Discovery and Learning of Predictive State Representations , 2005, NIPS.
[88] A. Ben Hamza,et al. Nonextensive information-theoretic measure for image edge detection , 2006, J. Electronic Imaging.
[89] Satinder P. Singh,et al. Mixtures of Predictive Linear Gaussian Models for Nonlinear, Stochastic Dynamical Systems , 2006, AAAI.
[90] Geoffrey E. Hinton,et al. Modeling Human Motion Using Binary Latent Variables , 2006, NIPS.
[91] Satinder P. Singh,et al. Predictive state representations with options , 2006, ICML.
[92] Michael H. Bowling,et al. Learning predictive state representations using non-blind policies , 2006, ICML '06.
[93] Martin J. Wainwright,et al. Log-determinant relaxation for approximate inference in discrete Markov random fields , 2006, IEEE Transactions on Signal Processing.
[94] Satinder P. Singh,et al. Kernel Predictive Linear Gaussian models for nonlinear stochastic dynamical systems , 2006, ICML.
[95] George E. Monahan,et al. A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 2007 .
[96] A. Willsky,et al. Maximum Entropy Relaxation for Graphical Model Selection Given Inconsistent Statistics , 2007, 2007 IEEE/SP 14th Workshop on Statistical Signal Processing.
[97] Satinder P. Singh,et al. On discovery and learning of models with predictive representations of state for agents with continuous actions and observations , 2007, AAMAS '07.
[98] Jesse Hoey,et al. Assisting persons with dementia during handwashing using a partially observable Markov decision process , 2007, ICVS.
[99] Satinder P. Singh,et al. Efficiently learning linear-linear exponential family predictive representations of state , 2008, ICML '08.
[100] Michael I. Jordan,et al. Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..