Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing
[1] Bo Wahlberg,et al. Recursive identification of chain dynamics in Hidden Markov Models using Non-Negative Matrix Factorization , 2015, 2015 54th IEEE Conference on Decision and Control (CDC).
[2] Vikram Krishnamurthy,et al. Myopic Bounds for Optimal Policy of POMDPs: An Extension of Lovejoy's Structural Results , 2014, Oper. Res..
[3] Vikram Krishnamurthy,et al. Online Reputation and Polling Systems: Data Incest, Social Learning, and Revealed Preferences , 2014, IEEE Transactions on Computational Social Systems.
[4] Cristian R. Rojas,et al. Reduced Complexity HMM Filtering With Stochastic Dominance Bounds: A Convex Optimization Approach , 2014, IEEE Transactions on Signal Processing.
[5] Ali Sayed,et al. Adaptation, Learning, and Optimization over Networks , 2014, Found. Trends Mach. Learn..
[6] Vikram Krishnamurthy,et al. Interactive Sensing and Decision Making in Social Networks , 2014, Found. Trends Signal Process..
[7] Nicole Bäuerle,et al. More Risk-Sensitive Markov Decision Processes , 2014, Math. Oper. Res..
[8] Özlem Çavus,et al. Risk-Averse Control of Undiscounted Transient Markov Models , 2012, SIAM J. Control. Optim..
[9] Fernando Vega-Redondo,et al. Complex Social Networks: Searching in Social Networks , 2007 .
[10] Bo Wahlberg,et al. Computing monotone policies for Markov decision processes by exploiting sparsity , 2013, 2013 Australian Control Conference.
[11] Vikram Krishnamurthy. How to Schedule Measurements of a Noisy Markov Chain in Decision Making? , 2013, IEEE Transactions on Information Theory.
[12] H. Vincent Poor,et al. Social learning and bayesian games in multiagent signal processing: how do local and global decision makers interact? , 2013, IEEE Signal Processing Magazine.
[13] Anna Scaglione,et al. Models for the Diffusion of Beliefs in Social Networks: An Overview , 2013, IEEE Signal Processing Magazine.
[14] Gang George Yin,et al. Distributed Tracking of Correlated Equilibria in Regime Switching Noncooperative Games , 2013, IEEE Transactions on Automatic Control.
[15] Langford B. White,et al. Maximum Likelihood Sequence Estimation for Hidden Reciprocal Processes , 2013, IEEE Transactions on Automatic Control.
[16] Gábor Lugosi,et al. Concentration Inequalities - A Nonasymptotic Theory of Independence , 2013, Concentration Inequalities.
[17] Tal Ben-Zvi,et al. Partially Observed Markov Decision Processes with Binomial Observations , 2013, Oper. Res. Lett..
[18] Vikram Krishnamurthy,et al. Detection of Anomalous Trajectory Patterns in Target Tracking via Stochastic Context-Free Grammars and Reciprocal Process Models , 2013, IEEE Journal of Selected Topics in Signal Processing.
[19] Milica Gasic,et al. POMDP-Based Statistical Spoken Dialog Systems: A Review , 2013, Proceedings of the IEEE.
[20] Michel Benaïm,et al. Consistency of Vanishingly Smooth Fictitious Play , 2011, Math. Oper. Res..
[21] Guy Shani,et al. A Survey of Point-Based POMDP Solvers , 2022 .
[22] Evangelos Markakis,et al. A Game-Theoretic Analysis of a Competitive Diffusion Process over Social Networks , 2012, WINE.
[23] Cameron Marlow,et al. A 61-million-person experiment in social influence and political mobilization , 2012, Nature.
[24] Bruno Strulovici,et al. Aggregating the single crossing property , 2012 .
[25] Samuel N. Cohen,et al. Stochastic Processes, Finance And Control: A Festschrift In Honor Of Robert J Elliott , 2012 .
[26] Hal R. Varian,et al. Revealed Preference and its Applications , 2012 .
[27] Tara Javidi,et al. Active Sequential Hypothesis Testing , 2012, ArXiv.
[28] Shipra Agrawal,et al. Analysis of Thompson Sampling for the Multi-armed Bandit Problem , 2011, COLT.
[29] Aleksey S. Polunchenko,et al. State-of-the-Art in Sequential Change-Point Detection , 2011, 1109.2938.
[30] John K.-H. Quah,et al. Aggregating the Single Crossing Property , 2012 .
[31] Maxim Raginsky,et al. Shannon meets Blackwell and Le Cam: Channels, codes, and statistical experiments , 2011, 2011 IEEE International Symposium on Information Theory Proceedings.
[32] Vikram Krishnamurthy,et al. Bayesian Sequential Detection With Phase-Distributed Change Time and Nonlinear Penalty—A POMDP Lattice Programming Approach , 2011, IEEE Transactions on Information Theory.
[33] N. Higham,et al. On pth Roots of Stochastic Matrices , 2011 .
[34] Filip Matejka,et al. Rational Inattention to Discrete Choices: A New Foundation for the Multinomial Logit Model , 2011 .
[35] Stephen P. Boyd,et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn..
[36] Taposh Banerjee,et al. Data-Efficient Quickest Change Detection with On–Off Observation Control , 2011, ArXiv.
[37] Ba Di Ya,et al. Matrix Analysis , 2011 .
[38] Boleslaw K. Szymanski,et al. Social consensus through the influence of committed minorities , 2011, Physical review. E, Statistical, nonlinear, and soft matter physics.
[39] Asuman E. Ozdaglar,et al. Opinion Dynamics and Learning in Social Networks , 2010, Dyn. Games Appl..
[40] Dimitri P. Bertsekas,et al. Q-learning and enhanced policy iteration in discounted dynamic programming , 2010, 49th IEEE Conference on Decision and Control (CDC).
[41] Andrzej Ruszczynski,et al. Risk-averse dynamic programming for Markov decision processes , 2010, Math. Program..
[42] G. Moustakides,et al. State-of-the-Art in Bayesian Changepoint Detection , 2010 .
[43] Emmanuel J. Candès,et al. The Power of Convex Relaxation: Near-Optimal Matrix Completion , 2009, IEEE Transactions on Information Theory.
[44] Qing Zhao,et al. Indexability of Restless Bandit Problems and Optimality of Whittle Index for Dynamic Multichannel Access , 2008, IEEE Transactions on Information Theory.
[45] Vikram Krishnamurthy,et al. Monotonicity of Constrained Optimal Transmission Policies in Correlated Fading Channels With ARQ , 2010, IEEE Transactions on Signal Processing.
[46] Vikram Krishnamurthy,et al. Optimal Threshold Policies for Multivariate POMDPs in Radar Resource Management , 2009, IEEE Transactions on Signal Processing.
[47] Sunghee Lee. Understanding Respondent Driven Sampling from a Total Survey Error Perspective , 2009 .
[48] Vikram Krishnamurthy,et al. Optimality of threshold policies for transmission scheduling in correlated fading channels , 2009, IEEE Transactions on Communications.
[49] Lieven Vandenberghe,et al. Interior-Point Method for Nuclear Norm Approximation with Application to System Identification , 2009, SIAM J. Matrix Anal. Appl..
[50] Matthew J. Salganik,et al. Respondent‐driven sampling as Markov chain Monte Carlo , 2009, Statistics in medicine.
[51] Hector Geffner,et al. A Translation-Based Approach to Contingent Planning , 2009, IJCAI.
[52] Venugopal V. Veeravalli,et al. Bayesian quickest change process detection , 2009, 2009 IEEE International Symposium on Information Theory.
[53] Michael L. Littman,et al. A tutorial on partially observable Markov decision processes , 2009 .
[54] S. Haykin,et al. Cubature Kalman Filters , 2009, IEEE Transactions on Automatic Control.
[55] Bo Wahlberg,et al. Partially Observed Markov Decision Process Multiarmed Bandits - Structural Results , 2009, Math. Oper. Res..
[56] Gang George Yin,et al. How does a stochastic optimization/approximation algorithm adapt to a randomly evolving optimum/root with jump Markov sample paths , 2009, Math. Program..
[57] Mirjam Dür,et al. An Adaptive Linear Approximation Algorithm for Copositive Programs , 2009, SIAM J. Optim..
[58] Sham M. Kakade,et al. A spectral algorithm for learning Hidden Markov Models , 2008, J. Comput. Syst. Sci..
[59] Dimitri P. Bertsekas,et al. Neuro-Dynamic Programming , 2009, Encyclopedia of Optimization.
[60] Luca Maria Gambardella,et al. A survey on metaheuristics for stochastic combinatorial optimization , 2009, Natural Computing.
[61] G. Casella,et al. The Bayesian Lasso , 2008 .
[62] Ali H. Sayed,et al. Adaptive Filters , 2008 .
[63] Mirjam Dür,et al. Algorithmic copositivity detection by simplicial partition , 2008 .
[64] Dunia López-Pintado,et al. Diffusion in complex social networks , 2008, Games Econ. Behav..
[65] K. Ramanan,et al. Concentration Inequalities for Dependent Random Variables via the Martingale Method , 2006, math/0609835.
[66] Tsachy Weissman,et al. Universal Filtering Via Hidden Markov Modeling , 2008, IEEE Transactions on Information Theory.
[67] H. Vincent Poor,et al. Quickest Detection: Probabilistic framework , 2008 .
[68] A. Doucet,et al. A Tutorial on Particle Filtering and Smoothing: Fifteen years later , 2008 .
[69] Bruno Strulovici,et al. Comparative Statics, Informativeness, and the Interval Dominance Order , 2009 .
[70] Peter W. Glynn,et al. Proceedings of the 2nd international conference on Performance evaluation methodologies and tools , 2007 .
[71] Abraham Grosfeld-Nir,et al. Control limits for two-state partially observable Markov decision processes , 2007, Eur. J. Oper. Res..
[72] Vikram Krishnamurthy,et al. Structured Threshold Policies for Dynamic Sensor Scheduling—A Partially Observed Markov Decision Process Approach , 2007, IEEE Transactions on Signal Processing.
[73] Leonard Rogers,et al. Valuations and Dynamic Convex Risk Measures , 2007, 0709.0232.
[74] John W. Fisher,et al. Approximate Dynamic Programming for Communication-Constrained Sensor Network Management , 2007, IEEE Transactions on Signal Processing.
[75] Vikram Krishnamurthy,et al. Decentralized Activation in a ZigBee-enabled Unattended Ground Sensor Network: A Correlated Equilibrium Game Theoretic Analysis , 2007, 2007 IEEE International Conference on Communications.
[76] Subhrakanti Dey,et al. Stability of Kalman filtering with Markovian packet losses , 2007, Autom..
[77] Ananthram Swami,et al. Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: A POMDP framework , 2007, IEEE Journal on Selected Areas in Communications.
[78] Dimitri P. Bertsekas,et al. Stochastic optimal control : the discrete time case , 2007 .
[79] Guy Shani,et al. Forward Search Value Iteration for POMDPs , 2007, IJCAI.
[80] A. Lansky,et al. Developing an HIV Behavioral Surveillance System for Injecting Drug Users: The National HIV Behavioral Surveillance System , 2007, Public health reports.
[81] George E. Monahan,et al. A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 2007 .
[82] Vikram Krishnamurthy,et al. Opportunistic file transfer over a fading channel: A POMDP search theory formulation with optimal threshold policies , 2006, IEEE Transactions on Wireless Communications.
[83] Josef Hofbauer,et al. Stochastic Approximations and Differential Inclusions, Part II: Applications , 2006, Math. Oper. Res..
[84] L. Platzman. Optimal Infinite-Horizon Undiscounted Control of Finite Probabilistic Systems , 2006 .
[85] Ari Arapostathis,et al. On the existence of stationary optimal policies for partially observed MDPs under the long-run average cost criterion , 2006, Syst. Control. Lett..
[86] Lones Smith,et al. Informational Herding and Optimal Experimentation , 2006 .
[87] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[88] Geoffrey J. Gordon,et al. Finding Approximate POMDP solutions Through Belief Compression , 2011, J. Artif. Intell. Res..
[89] Eric Moulines,et al. Inference in hidden Markov models , 2010, Springer series in statistics.
[90] S. Ethier,et al. Markov Processes: Characterization and Convergence , 2005 .
[91] Yaakov Oshman,et al. A Cramér-Rao-type estimation lower bound for systems with measurement faults , 2005, IEEE Transactions on Automatic Control.
[92] Gang George Yin,et al. LMS algorithms for tracking slow Markov chains with applications to hidden Markov estimation and adaptive multiuser detection , 2005, IEEE Transactions on Information Theory.
[93] Xin Guo,et al. On the optimality of conditional expectation as a Bregman predictor , 2005, IEEE Trans. Inf. Theory.
[94] Josef Hofbauer,et al. Stochastic Approximations and Differential Inclusions , 2005, SIAM J. Control. Optim..
[95] Nikos A. Vlassis,et al. Perseus: Randomized Point-based Value Iteration for POMDPs , 2005, J. Artif. Intell. Res..
[96] Robin J. Evans,et al. Networked sensor management and data rate control for tracking maneuvering targets , 2005, IEEE Transactions on Signal Processing.
[97] L. Vesterlund,et al. Dynamic Monopoly Pricing and Herding , 2005 .
[98] Simon Haykin,et al. Cognitive radio: brain-empowered wireless communications , 2005, IEEE Journal on Selected Areas in Communications.
[99] Shlomo Shamai,et al. Mutual information and minimum mean-square error in Gaussian channels , 2004, IEEE Transactions on Information Theory.
[100] V. Veeravalli,et al. General Asymptotic Bayesian Theory of Quickest Change Detection , 2005 .
[101] Dimitri P. Bertsekas,et al. Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC , 2005, Eur. J. Control.
[102] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[103] Christian P. Robert,et al. Monte Carlo Statistical Methods , 2005, Springer Texts in Statistics.
[104] G. Yin,et al. Discrete-Time Markov Chains: Two-Time-Scale Methods and Applications , 2004 .
[105] R. Douc,et al. Asymptotic properties of the maximum likelihood estimator in autoregressive models with Markov regime , 2004, math/0503681.
[106] Richard M. Murray,et al. Consensus problems in networks of agents with switching topology and time-delays , 2004, IEEE Transactions on Automatic Control.
[107] P. Moral. Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications , 2004 .
[108] Branko Ristic,et al. Beyond the Kalman Filter: Particle Filters for Tracking Applications , 2004 .
[109] Pierre Hansen,et al. On the geometry of Nash equilibria and correlated equilibria , 2003, Int. J. Game Theory.
[110] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[111] P. Kantor. Foundations of Statistical Natural Language Processing , 2001, Information Retrieval.
[112] Gang George Yin,et al. Regime Switching Stochastic Approximation Algorithms with Application to Adaptive Discrete Stochastic Optimization , 2004, SIAM J. Optim..
[113] Stephen P. Boyd,et al. Fastest Mixing Markov Chain on a Graph , 2004, SIAM Rev..
[114] Eitan Altman,et al. Discrete-Event Control of Stochastic Networks - Multimodularity and Regularity , 2004, Lecture notes in mathematics.
[115] R. Amir. Supermodularity and Complementarity in Economics: An Elementary Survey , 2003 .
[116] C. Chamley. Rational Herds: Economic Models of Social Learning , 2003 .
[117] Joelle Pineau,et al. Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.
[118] H. Kushner,et al. Stochastic Approximation and Recursive Algorithms and Applications , 2003 .
[119] Luc Vandendorpe,et al. Turbo synchronization: an EM algorithm interpretation , 2003, IEEE International Conference on Communications, 2003. ICC '03..
[120] M. Benaïm,et al. Deterministic Approximation of Stochastic Evolution in Games , 2003 .
[121] M. J. Todd,et al. Two new proofs of Afriat’s theorem , 2003 .
[122] Konstantinos V. Katsikopoulos,et al. Markov decision processes with delays and asynchronous cost collection , 2003, IEEE Trans. Autom. Control..
[123] Stéphane Boucheron,et al. Optimal error exponents in hidden Markov models order estimation , 2003, IEEE Trans. Inf. Theory.
[124] C. Sims. Implications of rational inattention , 2003 .
[125] Vikram Krishnamurthy,et al. The optimal search for a Markovian target when the search path is constrained: the infinite-horizon case , 2003, IEEE Trans. Autom. Control..
[126] James C. Spall,et al. Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.
[127] Frank Riedel,et al. Dynamic Coherent Risk Measures , 2003 .
[128] A. Cassandra. A Survey of POMDP Applications , 2003 .
[129] William H. Sandholm,et al. On the Global Convergence of Stochastic Fictitious Play , 2002 .
[130] James E. Smith,et al. Structural Properties of Stochastic Dynamic Programs , 2002, Oper. Res..
[131] Douglas Aberdeen,et al. Scalable Internal-State Policy-Gradient Methods for POMDPs , 2002, ICML.
[132] Vikram Krishnamurthy,et al. Algorithms for optimal scheduling and management of hidden Markov model sensors , 2002, IEEE Trans. Signal Process..
[133] Neri Merhav,et al. Hidden Markov processes , 2002, IEEE Trans. Inf. Theory.
[134] Ian F. Akyildiz,et al. Wireless sensor networks: a survey , 2002, Comput. Networks.
[135] A. Müller,et al. Comparison Methods for Stochastic Models and Risks , 2002 .
[136] Douglas D. Heckathorn,et al. Respondent-driven sampling II: deriving valid population estimates from chain-referral samples of hidden populations , 2002 .
[137] S. Athey. Monotone Comparative Statics under Uncertainty , 2002 .
[138] Peter L. Bartlett,et al. Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning , 2000, J. Comput. Syst. Sci..
[139] Eugene A. Feinberg,et al. Handbook of Markov Decision Processes , 2002 .
[140] George Casella,et al. Implementations of the Monte Carlo EM Algorithm , 2001 .
[141] P. Sørensen,et al. Information aggregation in debate: who should speak first? , 2001 .
[142] Ravindra K. Ahuja,et al. Inverse Optimization , 2001, Oper. Res..
[143] S. Morris,et al. Global Games: Theory and Applications , 2001 .
[144] Thiagalingam Kirubarajan,et al. Estimation with Applications to Tracking and Navigation , 2001 .
[145] Arnaud Doucet,et al. Particle filters for state estimation of jump Markov linear systems , 2001, IEEE Trans. Signal Process..
[146] Vivek S. Borkar,et al. Learning Algorithms for Markov Decision Processes with Average Cost , 2001, SIAM J. Control. Optim..
[147] Xiao-Li Meng,et al. The Art of Data Augmentation , 2001 .
[148] Alessandro Vespignani,et al. Epidemic spreading in scale-free networks. , 2000, Physical review letters.
[149] Jun S. Liu,et al. Monte Carlo strategies in scientific computing , 2001 .
[150] H. V. Trees. Detection, Estimation, And Modulation Theory , 2001 .
[151] Trevor Hastie,et al. The Elements of Statistical Learning , 2001 .
[152] Jinwen Ma,et al. Asymptotic Convergence Rate of the EM Algorithm for Gaussian Mixtures , 2000, Neural Computation.
[153] John Odentrantz,et al. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues , 2000, Technometrics.
[154] Milos Hauskrecht,et al. Value-Function Approximations for Partially Observable Markov Decision Processes , 2000, J. Artif. Intell. Res..
[155] Simon J. Godsill,et al. On sequential Monte Carlo sampling methods for Bayesian filtering , 2000, Stat. Comput..
[156] David Q. Mayne,et al. Constrained model predictive control: Stability and optimality , 2000, Autom..
[157] Andrew Y. Ng,et al. Algorithms for Inverse Reinforcement Learning , 2000, ICML.
[158] R. Rockafellar,et al. Optimization of conditional value-at risk , 2000 .
[159] Moshe Pollak,et al. Detecting a change in regression: first-order optimality , 1999 .
[160] Sigrún Andradóttir,et al. Accelerating the convergence of random search methods for discrete stochastic optimization , 1999, TOMC.
[161] Christos G. Cassandras,et al. Introduction to Discrete Event Systems , 1999, The Kluwer International Series on Discrete Event Dynamic Systems.
[162] Samuel S. Blackman,et al. Design and Analysis of Modern Tracking Systems , 1999 .
[163] Vikram Krishnamurthy,et al. Expectation maximization algorithms for MAP estimation of jump Markov linear systems , 1999, IEEE Trans. Signal Process..
[164] Philippe Artzner,et al. Coherent Measures of Risk , 1999 .
[165] O. Hernández-Lerma,et al. Discrete-time Markov control processes , 1999 .
[166] E. Altman. Constrained Markov Decision Processes , 1999 .
[167] Alf Isaksson,et al. On sensor scheduling via information theoretic criteria , 1999, Proceedings of the 1999 American Control Conference (Cat. No. 99CH36251).
[168] J. Booth,et al. Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm , 1999 .
[169] A Orman,et al. Optimization of Stochastic Models: The Interface Between Simulation and Optimization , 2012, J. Oper. Res. Soc..
[170] H. Poor. Quickest detection with exponential penalty for delay , 1998 .
[171] L. Sennott. Stochastic Dynamic Programming and the Control of Queueing Systems , 1998 .
[172] Sergio Verdu,et al. Multiuser Detection , 1998 .
[173] Carlos H. Muravchik,et al. Posterior Cramer-Rao bounds for discrete-time nonlinear filtering , 1998, IEEE Trans. Signal Process..
[174] D. M. Topkis. Supermodularity and Complementarity , 1998 .
[175] D. Fudenberg,et al. The Theory of Learning in Games , 1998 .
[176] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[177] A. Cassandra,et al. Exact and approximate algorithms for partially observable markov decision processes , 1998 .
[178] F. Gland,et al. Exponential Forgetting and Geometric Ergodicity in Hidden Markov Models , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[179] Alfred Müller,et al. How Does the Value Function of a Markov Decision Process Depend on the Transition Probabilities? , 1997, Math. Oper. Res..
[180] Robert J. Elliott,et al. Exact Finite-Dimensional Filters for Maximum Likelihood Parameter Estimation of Continuous-time Linear Gaussian Systems , 1997 .
[181] B. M. Pötscher,et al. Dynamic Nonlinear Econometric Models: Asymptotic Theory , 1997 .
[182] Douglas D. Heckathorn,et al. Respondent-driven sampling : A new approach to the study of hidden populations , 1997 .
[183] R. Atar,et al. Lyapunov Exponents for Finite State Nonlinear Filtering , 1997 .
[184] Masaaki Kijima,et al. Markov processes for stochastic modeling , 1997 .
[185] Jun S. Liu,et al. Sequential Monte Carlo methods for dynamic systems , 1997 .
[186] W. Chiou. A note on estimation algebras on nonlinear filtering theory , 1996 .
[187] Demosthenis Teneketzis,et al. Measurement scheduling for recursive team estimation , 1996 .
[188] Sigrún Andradóttir,et al. A Global Search Method for Discrete Stochastic Optimization , 1996, SIAM J. Optim..
[189] Stephen P. Boyd,et al. Semidefinite Programming , 1996, SIAM Rev..
[190] Vikram Krishnamurthy,et al. Time discretization of continuous-time filters and smoothers for HMM parameter estimation , 1996, IEEE Trans. Inf. Theory.
[191] J. Doyle,et al. Robust and optimal control , 1995, Proceedings of 35th IEEE Conference on Decision and Control.
[192] Thomas F. Coleman,et al. An Interior Trust Region Approach for Nonlinear Minimization Subject to Bounds , 1993, SIAM J. Optim..
[193] Michael I. Jordan,et al. On Convergence Properties of the EM Algorithm for Gaussian Mixtures , 1996, Neural Computation.
[194] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .
[195] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[196] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[197] Stuart J. Russell,et al. Approximating Optimal Policies for Partially Observable Stochastic Domains , 1995, IJCAI.
[198] Luca Maria Gambardella,et al. Ant-Q: A Reinforcement Learning Approach to the Traveling Salesman Problem , 1995, ICML.
[199] D. Fudenberg,et al. Consistency and Cautious Fictitious Play , 1995 .
[200] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[201] Benjamin Paul Jordan. On optimal search for a moving target , 1995 .
[202] Dimitri P. Bertsekas,et al. Nonlinear Programming , 1997 .
[203] John B. Moore,et al. Hidden Markov Models: Estimation and Control , 1994 .
[204] D. Rubin,et al. The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence , 1994 .
[205] L. Tierney. Markov Chains for Exploring Posterior Distributions , 1994 .
[206] Alfred O. Hero,et al. Space-alternating generalized expectation-maximization algorithm , 1994, IEEE Trans. Signal Process..
[207] James D. Hamilton,et al. Autoregressive conditional heteroskedasticity and changes in regime , 1994 .
[208] Leslie Pack Kaelbling,et al. Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.
[209] Michael Pinedo,et al. Scheduling: Theory, Algorithms, and Systems , 1994 .
[210] Xuan Kong,et al. Adaptive Signal Processing Algorithms: Stability and Performance , 1994 .
[211] Rudi Zagst,et al. Monotonicity and bounds for convex stochastic control models , 1994, Math. Methods Oper. Res..
[212] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[213] M. James,et al. Risk-sensitive control and dynamic games for partially observed discrete-time nonlinear systems , 1994, IEEE Trans. Autom. Control..
[214] Xiao-Li Meng,et al. On the rate of convergence of the ECM algorithm , 1994 .
[215] H. Poor. An Introduction to Signal Detection and Estimation , 1994, Springer Texts in Electrical Engineering.
[216] Paul R. Milgrom,et al. Monotone Comparative Statics , 1994 .
[217] D. Sworder,et al. Image-enhanced estimation methods , 1993, Proc. IEEE.
[218] William S. Lovejoy,et al. Suboptimal Policies, with Bounds, for Parameter Adaptive Decision Processes , 1993, Oper. Res..
[219] N. Gordon,et al. Novel approach to nonlinear/non-Gaussian Bayesian state estimation , 1993 .
[220] M. K. Ghosh,et al. Discrete-time controlled Markov processes with average cost criterion: a survey , 1993 .
[221] Richard L. Tweedie,et al. Markov Chains and Stochastic Stability , 1993, Communications and Control Engineering Series.
[222] W. Fleming,et al. Controlled Markov processes and viscosity solutions , 1992 .
[223] T. Lindvall. Lectures on the Coupling Method , 1992 .
[224] A. Bensoussan. Stochastic Control of Partially Observable Systems , 1992 .
[225] A. Banerjee,et al. A Simple Model of Herd Behavior , 1992 .
[226] Boris Polyak,et al. Acceleration of stochastic approximation by averaging , 1992 .
[227] S. Bikhchandani,et al. A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades , 2007 .
[228] Koji Iida,et al. Studies on the Optimal Search Plan , 1992 .
[229] B. Leroux. Maximum-likelihood estimation for hidden Markov models , 1992 .
[230] Ulrich Rieder,et al. Structural results for partially observed control models , 1991, ZOR Methods Model. Oper. Res..
[231] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes , 1991 .
[232] B. Conolly. Structured Stochastic Matrices of M/G/1 Type and Their Applications , 1991 .
[233] William S. Lovejoy,et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes , 1991, Oper. Res..
[234] D. W. Lewis. Matrix theory , 1991 .
[235] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[236] G. Barles,et al. Convergence of approximation schemes for fully nonlinear second order equations , 1990, 29th IEEE Conference on Decision and Control.
[237] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[238] B. Anderson,et al. Optimal control: linear quadratic methods , 1990 .
[239] J. Bather,et al. Multi‐Armed Bandit Allocation Indices , 1990 .
[240] J. Mendel,et al. Maximum-Likelihood Deconvolution: A Journey into Model-Based Signal Processing , 1990 .
[241] A. Bensoussan,et al. Optimal sensor scheduling in nonlinear filtering of diffusion processes , 1989 .
[242] Lawrence D. Stone. OR Forum - What's Happened in Search Theory Since the 1975 Lanchester Prize? , 1989, Oper. Res..
[243] D. Ghosh. Maximum likelihood estimation of the dynamic shock-error model , 1989 .
[244] E. Weinstein,et al. A new method for evaluating the log-likelihood gradient, the Hessian, and the Fisher information matrix for linear dynamic systems , 1989, IEEE Trans. Inf. Theory.
[245] Jerzy A. Filar,et al. Variance-Penalized Markov Decision Processes , 1989, Math. Oper. Res..
[246] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[247] E. Hannan,et al. The statistical theory of linear systems , 1989 .
[248] P. Caines. Linear Stochastic Systems , 1988 .
[249] R. Mohler,et al. Nonlinear data observability and information , 1988 .
[250] J. George Shanthikumar,et al. DFR Property of First-Passage Times and its Preservation Under Geometric Compounding , 1988 .
[251] Petre Stoica,et al. Decentralized Control , 2018, The Control Systems Handbook.
[252] William S. Lovejoy,et al. Some Monotonicity Results for Partially Observed Markov Decision Processes , 1987, Oper. Res..
[253] John N. Tsitsiklis,et al. The Complexity of Markov Decision Processes , 1987, Math. Oper. Res..
[254] C. N. Morris,et al. The calculation of posterior distributions by data augmentation , 1987 .
[255] William S. Lovejoy. Ordered Solutions for Dynamic Programs , 1987, Math. Oper. Res..
[256] S. N. Afriat,et al. Logic of choice and economic theory , 1987 .
[257] Ioannis Karatzas,et al. Brownian Motion and Stochastic Calculus , 1987 .
[258] G. Moustakides. Optimal stopping times for detecting changes in distributions , 1986 .
[259] Pravin Varaiya,et al. Stochastic Systems: Estimation, Identification, and Adaptive Control , 1986 .
[260] T. Nakai. The problem of optimal stopping in a partially observable Markov chain , 1985 .
[261] H. Robbins,et al. Asymptotically efficient adaptive allocation rules , 1985 .
[262] S. Marcus. Algebraic and Geometric Methods in Nonlinear Filtering , 1984 .
[263] James N. Eagle. The Optimal Search for a Moving Target When the Search Path Is Constrained , 1984, Oper. Res..
[264] Harold J. Kushner,et al. Approximation and Weak Convergence Methods for Random Processes , 1984 .
[265] Lennart Ljung,et al. Theory and Practice of Recursive Identification , 1983 .
[266] Valerie Isham,et al. Non‐Negative Matrices and Markov Chains , 1983 .
[267] C. F. Jeff Wu. On the Convergence Properties of the EM Algorithm , 1983 .
[268] H. Varian. Non-parametric Tests of Consumer Behaviour , 1983 .
[269] W. Whitt. Multivariate monotone likelihood ratio and uniform conditional stochastic order , 1982, Journal of Applied Probability.
[270] R. Shumway,et al. An Approach to Time Series Smoothing and Forecasting Using the EM Algorithm , 1982 .
[271] H. Varian. The Nonparametric Approach to Demand Analysis , 1982 .
[272] B. Anderson,et al. Optimal Filtering , 1979, IEEE Transactions on Systems, Man, and Cybernetics.
[273] T. Louis. Finding the Observed Information Matrix When Using the EM Algorithm , 1982 .
[274] É. Pardoux,et al. Équations du filtrage non linéaire, de la prédiction et du lissage , 1982 .
[275] Daniel P. Heyman,et al. Stochastic models in operations research , 1982 .
[276] V. Benes. Exact finite-dimensional filters for certain diffusions with nonlinear drift , 1981 .
[277] S. Karlin,et al. A second course in stochastic processes , 1981 .
[278] Paul R. Milgrom,et al. Good News and Bad News: Representation Theorems and Applications , 1981 .
[279] S. Karlin,et al. Classes of orderings of measures and related correlation inequalities. I. Multivariate totally positive distributions , 1980 .
[280] C. White,et al. Application of Jensen's inequality to adaptive suboptimal design , 1980 .
[281] Mark H. A. Davis. On a multiplicative functional transformation arising in nonlinear filtering theory , 1980 .
[282] Harold J. Kushner,et al. A Robust Discrete State Approximation to the Optimal Nonlinear Filter for a Diffusion. , 1980 .
[283] P. Whittle. Multi‐Armed Bandits and the Gittins Index , 1980 .
[284] Thomas Kailath,et al. Linear Systems , 1980 .
[285] P. Billingsley,et al. Probability and Measure , 1980 .
[286] Uriel G. Rothblum,et al. Optimal stopping, exponential utility, and linear programming , 1979, Math. Program..
[287] S. Christian Albright,et al. Structural Results for Partially Observable Markov Decision Processes , 1979, Oper. Res..
[288] Mark S. Granovetter. Threshold Models of Collective Behavior , 1978, American Journal of Sociology.
[289] Donald M. Topkis,et al. Minimizing a Submodular Function on a Lattice , 1978, Oper. Res..
[290] Alan S. Willsky,et al. Algebraic Structure and Finite Dimensional Nonlinear Estimation , 1978 .
[291] J. Vial,et al. Strategically zero-sum games: The class of games whose completely mixed equilibria cannot be improved upon , 1978 .
[292] Harold J. Kushner,et al. Stochastic approximation methods for constrained and unconstrained systems , 1978 .
[293] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[294] J. Keilson,et al. Monotone matrices and monotone Markov processes , 1977 .
[295] Lennart Ljung,et al. Analysis of recursive stochastic algorithms , 1977 .
[296] C. Derman,et al. Optimal System Allocations with Penalty Costs , 1976 .
[297] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[298] G. Lorden. Procedures for Reacting to a Change in Distribution , 1971 .
[299] Ronald A. Howard,et al. Dynamic Probabilistic Systems , 1971 .
[300] Edward J. Sondik,et al. The optimal control of partially observable Markov processes , 1971 .
[301] Stephen M. Pollock,et al. A Simple Model of Search for a Moving Target , 1970, Oper. Res..
[302] T. Cover,et al. Learning with Finite Memory , 1970 .
[303] P. Billingsley,et al. Convergence of Probability Measures , 1970, The Mathematical Gazette.
[304] A. Jazwinski. Stochastic Processes and Filtering Theory , 1970 .
[305] Thomas M. Cover,et al. The two-armed-bandit problem with time-invariant finite memory , 1970, IEEE Trans. Inf. Theory.
[306] L. Baum,et al. A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .
[307] M. Zakai. On the optimal filtering of diffusion processes , 1969 .
[308] D. Mayne,et al. Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering , 1969 .
[309] S. Ross. Arbitrary State Markovian Decision Processes , 1968 .
[310] D. Luenberger. Optimization by Vector Space Methods , 1968 .
[311] J. Peschon,et al. Optimal control of measurement subsystems , 1967, IEEE Transactions on Automatic Control.
[312] H. Kushner. Dynamical equations for optimal nonlinear filtering , 1967 .
[313] S. Afriat. The Construction of Utility Functions from Expenditure Data , 1967 .
[314] L. Baum,et al. Statistical Inference for Probabilistic Functions of Finite State Markov Chains , 1966 .
[315] R. Bellman. Dynamic Programming , 1957, Science.
[316] E. Dynkin. Controlled Random Sequences , 1965 .
[317] Karl Johan Åström,et al. Optimal control of Markov processes with incomplete state information , 1965 .
[318] Norbert Wiener,et al. Extrapolation, Interpolation, and Smoothing of Stationary Time Series , 1964 .
[319] W. Wonham. Some applications of stochastic differential equations to optimal nonlinear filtering , 1964 .
[320] W. Rudin. Principles of mathematical analysis , 1964 .
[321] R. E. Kalman,et al. When Is a Linear Control System Optimal , 1964 .
[322] A. Shiryaev. On Optimum Methods in Quickest Detection Problems , 1963 .
[323] A. N. Kolmogorov,et al. Interpolation and extrapolation of stationary random sequences , 1962 .
[324] P. Billingsley,et al. Statistical inference for Markov processes , 1961 .
[325] R. E. Kalman,et al. New Results in Linear Filtering and Prediction Theory , 1961 .
[326] R. E. Kalman,et al. A New Approach to Linear Filtering and Prediction Problems , 1960 .
[327] R. L. Stratonovich. Conditional Markov Processes , 1960 .
[328] A. Wald. Note on the Consistency of the Maximum Likelihood Estimate , 1949 .