Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing


[1]  Bo Wahlberg,et al.  Recursive identification of chain dynamics in Hidden Markov Models using Non-Negative Matrix Factorization , 2015, 2015 54th IEEE Conference on Decision and Control (CDC).

[2]  Vikram Krishnamurthy,et al.  Myopic Bounds for Optimal Policy of POMDPs: An Extension of Lovejoy's Structural Results , 2014, Oper. Res..

[3]  Vikram Krishnamurthy,et al.  Online Reputation and Polling Systems: Data Incest, Social Learning, and Revealed Preferences , 2014, IEEE Transactions on Computational Social Systems.

[4]  Cristian R. Rojas,et al.  Reduced Complexity HMM Filtering With Stochastic Dominance Bounds: A Convex Optimization Approach , 2014, IEEE Transactions on Signal Processing.

[5]  Ali Sayed,et al.  Adaptation, Learning, and Optimization over Networks , 2014, Found. Trends Mach. Learn..

[6]  Vikram Krishnamurthy,et al.  Interactive Sensing and Decision Making in Social Networks , 2014, Found. Trends Signal Process..

[7]  Nicole Bäuerle,et al.  More Risk-Sensitive Markov Decision Processes , 2014, Math. Oper. Res..

[8]  Özlem Çavus,et al.  Risk-Averse Control of Undiscounted Transient Markov Models , 2012, SIAM J. Control. Optim..

[9]  Fernando Vega-Redondo,et al.  Complex Social Networks: Searching in Social Networks , 2007 .

[10]  Bo Wahlberg,et al.  Computing monotone policies for Markov decision processes by exploiting sparsity , 2013, 2013 Australian Control Conference.

[11]  Vikram Krishnamurthy How to Schedule Measurements of a Noisy Markov Chain in Decision Making? , 2013, IEEE Transactions on Information Theory.

[12]  H. Vincent Poor,et al.  Social learning and bayesian games in multiagent signal processing: how do local and global decision makers interact? , 2013, IEEE Signal Processing Magazine.

[13]  Anna Scaglione,et al.  Models for the Diffusion of Beliefs in Social Networks: An Overview , 2013, IEEE Signal Processing Magazine.

[14]  Gang George Yin,et al.  Distributed Tracking of Correlated Equilibria in Regime Switching Noncooperative Games , 2013, IEEE Transactions on Automatic Control.

[15]  Langford B. White,et al.  Maximum Likelihood Sequence Estimation for Hidden Reciprocal Processes , 2013, IEEE Transactions on Automatic Control.

[16]  Gábor Lugosi,et al.  Concentration Inequalities - A Nonasymptotic Theory of Independence , 2013, Concentration Inequalities.

[17]  Tal Ben-Zvi,et al.  Partially Observed Markov Decision Processes with Binomial Observations , 2013, Oper. Res. Lett..

[18]  Vikram Krishnamurthy,et al.  Detection of Anomalous Trajectory Patterns in Target Tracking via Stochastic Context-Free Grammars and Reciprocal Process Models , 2013, IEEE Journal of Selected Topics in Signal Processing.

[19]  Milica Gasic,et al.  POMDP-Based Statistical Spoken Dialog Systems: A Review , 2013, Proceedings of the IEEE.

[20]  Michel Benaïm,et al.  Consistency of Vanishingly Smooth Fictitious Play , 2011, Math. Oper. Res..

[21]  Guy Shani,et al.  A Survey of Point-Based POMDP Solvers , 2013, Autonomous Agents and Multi-Agent Systems.

[22]  Evangelos Markakis,et al.  A Game-Theoretic Analysis of a Competitive Diffusion Process over Social Networks , 2012, WINE.

[23]  Cameron Marlow,et al.  A 61-million-person experiment in social influence and political mobilization , 2012, Nature.

[24]  Bruno Strulovici,et al.  Aggregating the single crossing property , 2012 .

[25]  Samuel N. Cohen,et al.  Stochastic Processes, Finance And Control: A Festschrift In Honor Of Robert J Elliott , 2012 .

[26]  Hal R. Varian,et al.  Revealed Preference and its Applications , 2012 .

[27]  Tara Javidi,et al.  Active Sequential Hypothesis Testing , 2012, ArXiv.

[28]  Shipra Agrawal,et al.  Analysis of Thompson Sampling for the Multi-armed Bandit Problem , 2011, COLT.

[29]  Aleksey S. Polunchenko,et al.  State-of-the-Art in Sequential Change-Point Detection , 2011, 1109.2938.

[30]  John K.-H. Quah,et al.  Aggregating the Single Crossing Property , 2012 .

[31]  Maxim Raginsky,et al.  Shannon meets Blackwell and Le Cam: Channels, codes, and statistical experiments , 2011, 2011 IEEE International Symposium on Information Theory Proceedings.

[32]  Vikram Krishnamurthy,et al.  Bayesian Sequential Detection With Phase-Distributed Change Time and Nonlinear Penalty—A POMDP Lattice Programming Approach , 2011, IEEE Transactions on Information Theory.

[33]  N. Higham,et al.  On pth Roots of Stochastic Matrices , 2011 .

[34]  Filip Matejka,et al.  Rational Inattention to Discrete Choices: A New Foundation for the Multinomial Logit Model , 2011 .

[35]  Stephen P. Boyd,et al.  Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn..

[36]  Taposh Banerjee,et al.  Data-Efficient Quickest Change Detection with On–Off Observation Control , 2011, ArXiv.

[37]  Ba Di Ya,et al.  Matrix Analysis , 2011 .

[38]  Boleslaw K. Szymanski,et al.  Social consensus through the influence of committed minorities , 2011, Physical review. E, Statistical, nonlinear, and soft matter physics.

[39]  Asuman E. Ozdaglar,et al.  Opinion Dynamics and Learning in Social Networks , 2010, Dyn. Games Appl..

[40]  Dimitri P. Bertsekas,et al.  Q-learning and enhanced policy iteration in discounted dynamic programming , 2010, 49th IEEE Conference on Decision and Control (CDC).

[41]  Andrzej Ruszczynski,et al.  Risk-averse dynamic programming for Markov decision processes , 2010, Math. Program..

[42]  G. Moustakides,et al.  State-of-the-Art in Bayesian Changepoint Detection , 2010 .

[43]  Emmanuel J. Candès,et al.  The Power of Convex Relaxation: Near-Optimal Matrix Completion , 2009, IEEE Transactions on Information Theory.

[44]  Qing Zhao,et al.  Indexability of Restless Bandit Problems and Optimality of Whittle Index for Dynamic Multichannel Access , 2008, IEEE Transactions on Information Theory.

[45]  Vikram Krishnamurthy,et al.  Monotonicity of Constrained Optimal Transmission Policies in Correlated Fading Channels With ARQ , 2010, IEEE Transactions on Signal Processing.

[46]  Vikram Krishnamurthy,et al.  Optimal Threshold Policies for Multivariate POMDPs in Radar Resource Management , 2009, IEEE Transactions on Signal Processing.

[47]  Sunghee Lee Understanding Respondent Driven Sampling from a Total Survey Error Perspective , 2009 .

[48]  Vikram Krishnamurthy,et al.  Optimality of threshold policies for transmission scheduling in correlated fading channels , 2009, IEEE Transactions on Communications.

[49]  Lieven Vandenberghe,et al.  Interior-Point Method for Nuclear Norm Approximation with Application to System Identification , 2009, SIAM J. Matrix Anal. Appl..

[50]  Matthew J. Salganik,et al.  Respondent‐driven sampling as Markov chain Monte Carlo , 2009, Statistics in medicine.

[51]  Hector Geffner,et al.  A Translation-Based Approach to Contingent Planning , 2009, IJCAI.

[52]  Venugopal V. Veeravalli,et al.  Bayesian quickest change process detection , 2009, 2009 IEEE International Symposium on Information Theory.

[53]  Michael L. Littman,et al.  A tutorial on partially observable Markov decision processes , 2009 .

[54]  S. Haykin,et al.  Cubature Kalman Filters , 2009, IEEE Transactions on Automatic Control.

[55]  Bo Wahlberg,et al.  Partially Observed Markov Decision Process Multiarmed Bandits - Structural Results , 2009, Math. Oper. Res..

[56]  Gang George Yin,et al.  How does a stochastic optimization/approximation algorithm adapt to a randomly evolving optimum/root with jump Markov sample paths , 2009, Math. Program..

[57]  Mirjam Dür,et al.  An Adaptive Linear Approximation Algorithm for Copositive Programs , 2009, SIAM J. Optim..

[58]  Sham M. Kakade,et al.  A spectral algorithm for learning Hidden Markov Models , 2008, J. Comput. Syst. Sci..

[59]  Dimitri P. Bertsekas,et al.  Neuro-Dynamic Programming , 2009, Encyclopedia of Optimization.

[60]  Luca Maria Gambardella,et al.  A survey on metaheuristics for stochastic combinatorial optimization , 2009, Natural Computing.

[61]  G. Casella,et al.  The Bayesian Lasso , 2008 .

[62]  Ali H. Sayed,et al.  Adaptive Filters , 2008 .

[63]  Mirjam Dür,et al.  Algorithmic copositivity detection by simplicial partition , 2008 .

[64]  Dunia López-Pintado,et al.  Diffusion in complex social networks , 2008, Games Econ. Behav..

[65]  K. Ramanan,et al.  Concentration Inequalities for Dependent Random Variables via the Martingale Method , 2006, math/0609835.

[66]  Tsachy Weissman,et al.  Universal Filtering Via Hidden Markov Modeling , 2008, IEEE Transactions on Information Theory.

[67]  H. Vincent Poor,et al.  Quickest Detection: Probabilistic framework , 2008 .

[68]  A. Doucet,et al.  A Tutorial on Particle Filtering and Smoothing: Fifteen years later , 2008 .

[69]  Bruno Strulovici,et al.  Comparative Statics, Informativeness, and the Interval Dominance Order , 2009 .

[70]  Peter W. Glynn,et al.  Proceedings of the 2nd international conference on Performance evaluation methodologies and tools , 2007 .

[71]  Abraham Grosfeld-Nir,et al.  Control limits for two-state partially observable Markov decision processes , 2007, Eur. J. Oper. Res..

[72]  Vikram Krishnamurthy,et al.  Structured Threshold Policies for Dynamic Sensor Scheduling—A Partially Observed Markov Decision Process Approach , 2007, IEEE Transactions on Signal Processing.

[73]  Leonard Rogers,et al.  VALUATIONS AND DYNAMIC CONVEX RISK MEASURES , 2007, 0709.0232.

[74]  John W. Fisher,et al.  Approximate Dynamic Programming for Communication-Constrained Sensor Network Management , 2007, IEEE Transactions on Signal Processing.

[75]  Vikram Krishnamurthy,et al.  Decentralized Activation in a ZigBee-enabled Unattended Ground Sensor Network: A Correlated Equilibrium Game Theoretic Analysis , 2007, 2007 IEEE International Conference on Communications.

[76]  Subhrakanti Dey,et al.  Stability of Kalman filtering with Markovian packet losses , 2007, Autom..

[77]  Ananthram Swami,et al.  Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: A POMDP framework , 2007, IEEE Journal on Selected Areas in Communications.

[78]  Dimitri P. Bertsekas,et al.  Stochastic optimal control : the discrete time case , 2007 .

[79]  Guy Shani,et al.  Forward Search Value Iteration for POMDPs , 2007, IJCAI.

[80]  A. Lansky,et al.  Developing an HIV Behavioral Surveillance System for Injecting Drug Users: The National HIV Behavioral Surveillance System , 2007, Public health reports.

[81]  George E. Monahan,et al.  A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 2007 .

[82]  Vikram Krishnamurthy,et al.  Opportunistic file transfer over a fading channel: A POMDP search theory formulation with optimal threshold policies , 2006, IEEE Transactions on Wireless Communications.

[83]  Josef Hofbauer,et al.  Stochastic Approximations and Differential Inclusions, Part II: Applications , 2006, Math. Oper. Res..

[84]  L. Platzman Optimal Infinite-Horizon Undiscounted Control of Finite Probabilistic Systems , 2006 .

[85]  Ari Arapostathis,et al.  On the existence of stationary optimal policies for partially observed MDPs under the long-run average cost criterion , 2006, Syst. Control. Lett..

[86]  Lones Smith,et al.  Informational Herding and Optimal Experimentation , 2006 .

[87]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[88]  Geoffrey J. Gordon,et al.  Finding Approximate POMDP solutions Through Belief Compression , 2011, J. Artif. Intell. Res..

[89]  Eric Moulines,et al.  Inference in hidden Markov models , 2010, Springer series in statistics.

[90]  S. Ethier,et al.  Markov Processes: Characterization and Convergence , 2005 .

[91]  Yaakov Oshman,et al.  A Cramér-Rao-type estimation lower bound for systems with measurement faults , 2005, IEEE Transactions on Automatic Control.

[92]  Gang George Yin,et al.  LMS algorithms for tracking slow Markov chains with applications to hidden Markov estimation and adaptive multiuser detection , 2005, IEEE Transactions on Information Theory.

[93]  Xin Guo,et al.  On the optimality of conditional expectation as a Bregman predictor , 2005, IEEE Trans. Inf. Theory.

[94]  Josef Hofbauer,et al.  Stochastic Approximations and Differential Inclusions , 2005, SIAM J. Control. Optim..

[95]  Nikos A. Vlassis,et al.  Perseus: Randomized Point-based Value Iteration for POMDPs , 2005, J. Artif. Intell. Res..

[96]  Robin J. Evans,et al.  Networked sensor management and data rate control for tracking maneuvering targets , 2005, IEEE Transactions on Signal Processing.

[97]  L. Vesterlund,et al.  Dynamic Monopoly Pricing and Herding , 2005 .

[98]  Simon Haykin,et al.  Cognitive radio: brain-empowered wireless communications , 2005, IEEE Journal on Selected Areas in Communications.

[99]  Shlomo Shamai,et al.  Mutual information and minimum mean-square error in Gaussian channels , 2004, IEEE Transactions on Information Theory.

[100]  V. Veeravalli,et al.  General Asymptotic Bayesian Theory of Quickest Change Detection , 2005 .

[101]  Dimitri P. Bertsekas,et al.  Dynamic Programming and Suboptimal Control: A Survey from ADP to MPC , 2005, Eur. J. Control.

[102]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[103]  Christian P. Robert,et al.  Monte Carlo Statistical Methods , 2005, Springer Texts in Statistics.

[104]  G. Yin,et al.  Discrete-Time Markov Chains: Two-Time-Scale Methods and Applications , 2004 .

[105]  R. Douc,et al.  Asymptotic properties of the maximum likelihood estimator in autoregressive models with Markov regime , 2004, math/0503681.

[106]  Richard M. Murray,et al.  Consensus problems in networks of agents with switching topology and time-delays , 2004, IEEE Transactions on Automatic Control.

[107]  P. Moral Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications , 2004 .

[108]  Branko Ristic,et al.  Beyond the Kalman Filter: Particle Filters for Tracking Applications , 2004 .

[109]  Pierre Hansen,et al.  On the geometry of Nash equilibria and correlated equilibria , 2003, Int. J. Game Theory.

[110]  Peter Auer,et al.  Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.

[111]  P. Kantor Foundations of Statistical Natural Language Processing , 2001, Information Retrieval.

[112]  Gang George Yin,et al.  Regime Switching Stochastic Approximation Algorithms with Application to Adaptive Discrete Stochastic Optimization , 2004, SIAM J. Optim..

[113]  Stephen P. Boyd,et al.  Fastest Mixing Markov Chain on a Graph , 2004, SIAM Rev..

[114]  Eitan Altman,et al.  Discrete-Event Control of Stochastic Networks - Multimodularity and Regularity , 2004, Lecture notes in mathematics.

[115]  R. Amir Supermodularity and Complementarity in Economics: An Elementary Survey , 2003 .

[116]  C. Chamley Rational Herds: Economic Models of Social Learning , 2003 .

[117]  Joelle Pineau,et al.  Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.

[118]  H. Kushner,et al.  Stochastic Approximation and Recursive Algorithms and Applications , 2003 .

[119]  Luc Vandendorpe,et al.  Turbo synchronization: an EM algorithm interpretation , 2003, IEEE International Conference on Communications, 2003. ICC '03..

[120]  M. Benaïm,et al.  Deterministic Approximation of Stochastic Evolution in Games , 2003 .

[121]  M. J. Todd,et al.  Two new proofs of Afriat’s theorem , 2003 .

[122]  Konstantinos V. Katsikopoulos,et al.  Markov decision processes with delays and asynchronous cost collection , 2003, IEEE Trans. Autom. Control..

[123]  Stéphane Boucheron,et al.  Optimal error exponents in hidden Markov models order estimation , 2003, IEEE Trans. Inf. Theory.

[124]  C. Sims Implications of rational inattention , 2003 .

[125]  Vikram Krishnamurthy,et al.  The optimal search for a Markovian target when the search path is constrained: the infinite-horizon case , 2003, IEEE Trans. Autom. Control..

[126]  James C. Spall,et al.  Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.

[127]  Frank Riedel,et al.  Dynamic Coherent Risk Measures , 2003 .

[128]  A. Cassandra A Survey of POMDP Applications , 2003 .

[129]  William H. Sandholm,et al.  ON THE GLOBAL CONVERGENCE OF STOCHASTIC FICTITIOUS PLAY , 2002 .

[130]  James E. Smith,et al.  Structural Properties of Stochastic Dynamic Programs , 2002, Oper. Res..

[131]  Douglas Aberdeen,et al.  Scalable Internal-State Policy-Gradient Methods for POMDPs , 2002, ICML.

[132]  Vikram Krishnamurthy,et al.  Algorithms for optimal scheduling and management of hidden Markov model sensors , 2002, IEEE Trans. Signal Process..

[133]  Neri Merhav,et al.  Hidden Markov processes , 2002, IEEE Trans. Inf. Theory.

[134]  Ian F. Akyildiz,et al.  Wireless sensor networks: a survey , 2002, Comput. Networks.

[135]  A. Müller,et al.  Comparison Methods for Stochastic Models and Risks , 2002 .

[136]  Douglas D. Heckathorn,et al.  Respondent-driven sampling II: deriving valid population estimates from chain-referral samples of hidden populations , 2002 .

[137]  S. Athey Monotone Comparative Statics under Uncertainty , 2002 .

[138]  Peter L. Bartlett,et al.  Estimation and Approximation Bounds for Gradient-Based Reinforcement Learning , 2000, J. Comput. Syst. Sci..

[139]  Eugene A. Feinberg,et al.  Handbook of Markov Decision Processes , 2002 .

[140]  George Casella,et al.  Implementations of the Monte Carlo EM Algorithm , 2001 .

[141]  P. Sørensen,et al.  Information aggregation in debate: who should speak first? , 2001 .

[142]  Ravindra K. Ahuja,et al.  Inverse Optimization , 2001, Oper. Res..

[143]  S. Morris,et al.  Global Games: Theory and Applications , 2001 .

[144]  Thiagalingam Kirubarajan,et al.  Estimation with Applications to Tracking and Navigation , 2001 .

[145]  Arnaud Doucet,et al.  Particle filters for state estimation of jump Markov linear systems , 2001, IEEE Trans. Signal Process..

[146]  Vivek S. Borkar,et al.  Learning Algorithms for Markov Decision Processes with Average Cost , 2001, SIAM J. Control. Optim..

[147]  Xiao-Li Meng,et al.  The Art of Data Augmentation , 2001 .

[148]  Alessandro Vespignani,et al.  Epidemic spreading in scale-free networks. , 2000, Physical review letters.

[149]  Jun S. Liu,et al.  Monte Carlo strategies in scientific computing , 2001 .

[150]  H. L. Van Trees,  Detection, Estimation, and Modulation Theory , 2001 .

[151]  Trevor Hastie,et al.  The Elements of Statistical Learning , 2001 .

[152]  Jinwen Ma,et al.  Asymptotic Convergence Rate of the EM Algorithm for Gaussian Mixtures , 2000, Neural Computation.

[153]  John Odentrantz,et al.  Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues , 2000, Technometrics.

[154]  Milos Hauskrecht,et al.  Value-Function Approximations for Partially Observable Markov Decision Processes , 2000, J. Artif. Intell. Res..

[155]  Simon J. Godsill,et al.  On sequential Monte Carlo sampling methods for Bayesian filtering , 2000, Stat. Comput..

[156]  David Q. Mayne,et al.  Constrained model predictive control: Stability and optimality , 2000, Autom..

[157]  Andrew Y. Ng,et al.  Algorithms for Inverse Reinforcement Learning , 2000, ICML.

[158]  R. Rockafellar,et al.  Optimization of conditional value-at-risk , 2000 .

[159]  Moshe Pollak,et al.  Detecting a change in regression: first-order optimality , 1999 .

[160]  Sigrún Andradóttir,et al.  Accelerating the convergence of random search methods for discrete stochastic optimization , 1999, TOMC.

[161]  Christos G. Cassandras,et al.  Introduction to Discrete Event Systems , 1999, The Kluwer International Series on Discrete Event Dynamic Systems.

[162]  Samuel S. Blackman,et al.  Design and Analysis of Modern Tracking Systems , 1999 .

[163]  Vikram Krishnamurthy,et al.  Expectation maximization algorithms for MAP estimation of jump Markov linear systems , 1999, IEEE Trans. Signal Process..

[164]  Philippe Artzner,et al.  Coherent Measures of Risk , 1999 .

[165]  O. Hernández-Lerma,et al.  Discrete-time Markov control processes , 1999 .

[166]  E. Altman Constrained Markov Decision Processes , 1999 .

[167]  Alf Isaksson,et al.  On sensor scheduling via information theoretic criteria , 1999, Proceedings of the 1999 American Control Conference (Cat. No. 99CH36251).

[168]  J. Booth,et al.  Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm , 1999 .

[169]  A Orman,et al.  Optimization of Stochastic Models: The Interface Between Simulation and Optimization , 2012, J. Oper. Res. Soc..

[170]  H. Poor Quickest detection with exponential penalty for delay , 1998 .

[171]  L. Sennott Stochastic Dynamic Programming and the Control of Queueing Systems , 1998 .

[172]  Sergio Verdu,et al.  Multiuser Detection , 1998 .

[173]  Carlos H. Muravchik,et al.  Posterior Cramer-Rao bounds for discrete-time nonlinear filtering , 1998, IEEE Trans. Signal Process..

[174]  D. M. Topkis Supermodularity and Complementarity , 1998 .

[175]  D. Fudenberg,et al.  The Theory of Learning in Games , 1998 .

[176]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[177]  A. Cassandra,et al.  Exact and approximate algorithms for partially observable markov decision processes , 1998 .

[178]  F. Gland,et al.  Exponential Forgetting and Geometric Ergodicity in Hidden Markov Models , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.

[179]  Alfred Müller,et al.  How Does the Value Function of a Markov Decision Process Depend on the Transition Probabilities? , 1997, Math. Oper. Res..

[180]  Robert J. Elliott,et al.  Exact Finite-Dimensional Filters for Maximum Likelihood Parameter Estimation of Continuous-time Linear Gaussian Systems , 1997 .

[181]  B. M. Pötscher,et al.  Dynamic Nonlinear Econometric Models: Asymptotic Theory , 1997 .

[182]  Douglas D. Heckathorn,et al.  Respondent-driven sampling : A new approach to the study of hidden populations , 1997 .

[183]  R. Atar,et al.  Lyapunov Exponents for Finite State Nonlinear Filtering , 1997 .

[184]  Masaaki Kijima,et al.  Markov processes for stochastic modeling , 1997 .

[185]  Jun S. Liu,et al.  Sequential Monte Carlo methods for dynamic systems , 1997 .

[186]  W. Chiou A note on estimation algebras on nonlinear filtering theory , 1996 .

[187]  Demosthenis Teneketzis,et al.  Measurement scheduling for recursive team estimation , 1996 .

[188]  Sigrún Andradóttir,et al.  A Global Search Method for Discrete Stochastic Optimization , 1996, SIAM J. Optim..

[189]  Stephen P. Boyd,et al.  Semidefinite Programming , 1996, SIAM Rev..

[190]  Vikram Krishnamurthy,et al.  Time discretization of continuous-time filters and smoothers for HMM parameter estimation , 1996, IEEE Trans. Inf. Theory.

[191]  J. Doyle,et al.  Robust and optimal control , 1995, Proceedings of 35th IEEE Conference on Decision and Control.

[192]  Thomas F. Coleman,et al.  An Interior Trust Region Approach for Nonlinear Minimization Subject to Bounds , 1993, SIAM J. Optim..

[193]  Michael I. Jordan,et al.  On Convergence Properties of the EM Algorithm for Gaussian Mixtures , 1996, Neural Computation.

[194]  Michael L. Littman,et al.  Algorithms for Sequential Decision Making , 1996 .

[195]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[196]  Leslie Pack Kaelbling,et al.  Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.

[197]  Stuart J. Russell,et al.  Approximating Optimal Policies for Partially Observable Stochastic Domains , 1995, IJCAI.

[198]  Luca Maria Gambardella,et al.  Ant-Q: A Reinforcement Learning Approach to the Traveling Salesman Problem , 1995, ICML.

[199]  D. Fudenberg,et al.  Consistency and Cautious Fictitious Play , 1995 .

[200]  Dimitri P. Bertsekas,et al.  Dynamic Programming and Optimal Control, Two Volume Set , 1995 .

[201]  Benjamin Paul Jordan On optimal search for a moving target , 1995 .

[202]  Dimitri P. Bertsekas,et al.  Nonlinear Programming , 1997 .

[203]  John B. Moore,et al.  Hidden Markov Models: Estimation and Control , 1994 .

[204]  D. Rubin,et al.  The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence , 1994 .

[205]  L. Tierney Markov Chains for Exploring Posterior Distributions , 1994 .

[206]  Alfred O. Hero,et al.  Space-alternating generalized expectation-maximization algorithm , 1994, IEEE Trans. Signal Process..

[207]  James D. Hamilton,et al.  Autoregressive conditional heteroskedasticity and changes in regime , 1994 .

[208]  Leslie Pack Kaelbling,et al.  Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.

[209]  Michael Pinedo,et al.  Scheduling: Theory, Algorithms, and Systems , 1994 .

[210]  Xuan Kong,et al.  Adaptive Signal Processing Algorithms: Stability and Performance , 1994 .

[211]  Rudi Zagst,et al.  Monotonicity and bounds for convex stochastic control models , 1994, Math. Methods Oper. Res..

[212]  Martin L. Puterman,et al.  Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .

[213]  M. James,et al.  Risk-sensitive control and dynamic games for partially observed discrete-time nonlinear systems , 1994, IEEE Trans. Autom. Control..

[214]  Xiao-Li Meng,et al.  On the rate of convergence of the ECM algorithm , 1994 .

[215]  H. Poor An Introduction to Signal Detection and Estimation , 1994, Springer Texts in Electrical Engineering.

[216]  Paul R. Milgrom,et al.  Monotone Comparative Statics , 1994 .

[217]  D. Sworder,et al.  Image-enhanced estimation methods , 1993, Proc. IEEE.

[218]  William S. Lovejoy,et al.  Suboptimal Policies, with Bounds, for Parameter Adaptive Decision Processes , 1993, Oper. Res..

[219]  N. Gordon,et al.  Novel approach to nonlinear/non-Gaussian Bayesian state estimation , 1993 .

[220]  M. K. Ghosh,et al.  Discrete-time controlled Markov processes with average cost criterion: a survey , 1993 .

[221]  Richard L. Tweedie,et al.  Markov Chains and Stochastic Stability , 1993, Communications and Control Engineering Series.

[222]  W. Fleming,et al.  Controlled Markov processes and viscosity solutions , 1992 .

[223]  T. Lindvall Lectures on the Coupling Method , 1992 .

[224]  A. Bensoussan Stochastic Control of Partially Observable Systems , 1992 .

[225]  A. Banerjee,et al.  A Simple Model of Herd Behavior , 1992 .

[226]  Boris Polyak,et al.  Acceleration of stochastic approximation by averaging , 1992 .

[227]  S. Bikhchandani,et al.  A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades , 1992, Journal of Political Economy.

[228]  Koji Iida,et al.  Studies on the Optimal Search Plan , 1992 .

[229]  B. Leroux Maximum-likelihood estimation for hidden Markov models , 1992 .

[230]  Ulrich Rieder,et al.  Structural results for partially observed control models , 1991, ZOR Methods Model. Oper. Res..

[231]  W. Lovejoy A survey of algorithmic methods for partially observed Markov decision processes , 1991 .

[232]  B. Conolly Structured Stochastic Matrices of M/G/1 Type and Their Applications , 1991 .

[233]  William S. Lovejoy,et al.  Computationally Feasible Bounds for Partially Observed Markov Decision Processes , 1991, Oper. Res..

[234]  D. W. Lewis Matrix theory , 1991 .

[235]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[236]  G. Barles,et al.  Convergence of approximation schemes for fully nonlinear second order equations , 1990, 29th IEEE Conference on Decision and Control.

[237]  Pierre Priouret,et al.  Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.

[238]  B. Anderson,et al.  Optimal control: linear quadratic methods , 1990 .

[239]  J. Bather,et al.  Multi‐Armed Bandit Allocation Indices , 1990 .

[240]  J. Mendel,et al.  Maximum-Likelihood Deconvolution: A Journey into Model-Based Signal Processing , 1990 .

[241]  A. Bensoussan,et al.  Optimal sensor scheduling in nonlinear filtering of diffusion processes , 1989 .

[242]  Lawrence D. Stone OR Forum - What's Happened in Search Theory Since the 1975 Lanchester Prize? , 1989, Oper. Res..

[243]  D. Ghosh Maximum likelihood estimation of the dynamic shock-error model , 1989 .

[244]  E. Weinstein,et al.  A new method for evaluating the log-likelihood gradient, the Hessian, and the Fisher information matrix for linear dynamic systems , 1989, IEEE Trans. Inf. Theory.

[245]  Jerzy A. Filar,et al.  Variance-Penalized Markov Decision Processes , 1989, Math. Oper. Res..

[246]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[247]  E. Hannan,et al.  The statistical theory of linear systems , 1989 .

[248]  P. Caines Linear Stochastic Systems , 1988 .

[249]  R. Mohler,et al.  Nonlinear data observability and information , 1988 .

[250]  J. George Shanthikumar,et al.  DFR Property of First-Passage Times and its Preservation Under Geometric Compounding , 1988 .

[251]  Petre Stoica,et al.  Decentralized Control , 2018, The Control Systems Handbook.

[252]  William S. Lovejoy,et al.  Some Monotonicity Results for Partially Observed Markov Decision Processes , 1987, Oper. Res..

[253]  John N. Tsitsiklis,et al.  The Complexity of Markov Decision Processes , 1987, Math. Oper. Res..

[254]  C. N. Morris,et al.  The calculation of posterior distributions by data augmentation , 1987 .

[255]  William S. Lovejoy Ordered Solutions for Dynamic Programs , 1987, Math. Oper. Res..

[256]  S. N. Afriat,et al.  Logic of choice and economic theory , 1987 .

[257]  Ioannis Karatzas,et al.  Brownian Motion and Stochastic Calculus , 1987 .

[258]  G. Moustakides Optimal stopping times for detecting changes in distributions , 1986 .

[259]  Pravin Varaiya,et al.  Stochastic Systems: Estimation, Identification, and Adaptive Control , 1986 .

[260]  T. Nakai The problem of optimal stopping in a partially observable Markov chain , 1985 .

[261]  H. Robbins,et al.  Asymptotically efficient adaptive allocation rules , 1985 .

[262]  S. Marcus Algebraic and Geometric Methods in Nonlinear Filtering , 1984 .

[263]  James N. Eagle The Optimal Search for a Moving Target When the Search Path Is Constrained , 1984, Oper. Res..

[264]  Harold J. Kushner,et al.  Approximation and Weak Convergence Methods for Random Processes , 1984 .

[265]  Lennart Ljung,et al.  Theory and Practice of Recursive Identification , 1983 .

[266]  Valerie Isham,et al.  Non‐Negative Matrices and Markov Chains , 1983 .

[267]  C. F. J. Wu,  On the Convergence Properties of the EM Algorithm , 1983, Annals of Statistics.

[268]  H. Varian Non-parametric Tests of Consumer Behaviour , 1983 .

[269]  W. Whitt Multivariate monotone likelihood ratio and uniform conditional stochastic order , 1982, Journal of Applied Probability.

[270]  R. Shumway,et al.  An Approach to Time Series Smoothing and Forecasting Using the EM Algorithm , 1982 .

[271]  H. Varian The Nonparametric Approach to Demand Analysis , 1982 .

[272]  B. Anderson,et al.  Optimal Filtering , 1979, IEEE Transactions on Systems, Man, and Cybernetics.

[273]  T. Louis Finding the Observed Information Matrix When Using the EM Algorithm , 1982 .

[274]  É. Pardoux,  Équations du filtrage non linéaire, de la prédiction et du lissage , 1982 .

[275]  Daniel P. Heyman,et al.  Stochastic models in operations research , 1982 .

[276]  V. Benes Exact finite-dimensional filters for certain diffusions with nonlinear drift , 1981 .

[277]  S. Karlin,et al.  A second course in stochastic processes , 1981 .

[278]  Paul R. Milgrom,et al.  Good News and Bad News: Representation Theorems and Applications , 1981 .

[279]  S. Karlin,et al.  Classes of orderings of measures and related correlation inequalities. I. Multivariate totally positive distributions , 1980 .

[280]  C. White,et al.  Application of Jensen's inequality to adaptive suboptimal design , 1980 .

[281]  Mark H. A. Davis On a multiplicative functional transformation arising in nonlinear filtering theory , 1980 .

[282]  Harold J. Kushner,et al.  A Robust Discrete State Approximation to the Optimal Nonlinear Filter for a Diffusion. , 1980 .

[283]  P. Whittle Multi‐Armed Bandits and the Gittins Index , 1980 .

[284]  Thomas Kailath,et al.  Linear Systems , 1980 .

[285]  P. Billingsley,et al.  Probability and Measure , 1980 .

[286]  Uriel G. Rothblum,et al.  Optimal stopping, exponential utility, and linear programming , 1979, Math. Program..

[287]  S. Christian Albright,et al.  Structural Results for Partially Observable Markov Decision Processes , 1979, Oper. Res..

[288]  Mark S. Granovetter Threshold Models of Collective Behavior , 1978, American Journal of Sociology.

[289]  Donald M. Topkis,et al.  Minimizing a Submodular Function on a Lattice , 1978, Oper. Res..

[290]  Alan S. Willsky,et al.  Algebraic Structure and Finite Dimensional Nonlinear Estimation , 1978 .

[291]  J. Vial,et al.  Strategically zero-sum games: The class of games whose completely mixed equilibria cannot be improved upon , 1978 .

[292]  Harold J. Kushner,et al.  Stochastic approximation methods for constrained and unconstrained systems , 1978 .

[293]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm (with discussion) , 1977 .

[294]  J. Keilson,et al.  Monotone matrices and monotone Markov processes , 1977 .

[295]  Lennart Ljung,et al.  Analysis of recursive stochastic algorithms , 1977 .

[296]  C. Derman,et al.  Optimal System Allocations with Penalty Costs , 1976 .

[297]  Edward J. Sondik,et al.  The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..

[298]  G. Lorden,  Procedures for Reacting to a Change in Distribution , 1971 .

[299]  Ronald A. Howard,et al.  Dynamic Probabilistic Systems , 1971 .

[300]  Edward J. Sondik,et al.  The optimal control of par-tially observable Markov processes , 1971 .

[301]  Stephen M. Pollock,et al.  A Simple Model of Search for a Moving Target , 1970, Oper. Res..

[302]  T. Cover,et al.  Learning with Finite Memory , 1970 .

[303]  P. Billingsley,et al.  Convergence of Probability Measures , 1970, The Mathematical Gazette.

[304]  A. Jazwinski Stochastic Processes and Filtering Theory , 1970 .

[305]  Thomas M. Cover,et al.  The two-armed-bandit problem with time-invariant finite memory , 1970, IEEE Trans. Inf. Theory.

[306]  L. Baum,et al.  A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains , 1970 .

[307]  M. Zakai On the optimal filtering of diffusion processes , 1969 .

[308]  D. Mayne,et al.  Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering , 1969 .

[309]  S. Ross Arbitrary State Markovian Decision Processes , 1968 .

[310]  D. Luenberger Optimization by Vector Space Methods , 1968 .

[311]  J. Peschon,et al.  Optimal control of measurement subsystems , 1967, IEEE Transactions on Automatic Control.

[312]  H. Kushner Dynamical equations for optimal nonlinear filtering , 1967 .

[313]  S. Afriat THE CONSTRUCTION OF UTILITY FUNCTIONS FROM EXPENDITURE DATA , 1967 .

[314]  L. Baum,et al.  Statistical Inference for Probabilistic Functions of Finite State Markov Chains , 1966 .

[315]  R. Bellman Dynamic Programming , 1957, Science.

[316]  E. Dynkin Controlled Random Sequences , 1965 .

[317]  Karl Johan Åström,et al.  Optimal control of Markov processes with incomplete state information , 1965 .

[318]  Norbert Wiener,et al.  Extrapolation, Interpolation, and Smoothing of Stationary Time Series , 1964 .

[319]  W. Wonham,  Some applications of stochastic differential equations to optimal nonlinear filtering , 1964 .

[320]  W. Rudin Principles of mathematical analysis , 1964 .

[321]  R. E. Kalman,et al.  When Is a Linear Control System Optimal , 1964 .

[322]  A. Shiryaev On Optimum Methods in Quickest Detection Problems , 1963 .

[323]  A. N. Kolmogorov,et al.  Interpolation and extrapolation of stationary random sequences. , 1962 .

[324]  P. Billingsley,et al.  Statistical inference for Markov processes , 1961 .

[325]  R. E. Kalman,et al.  New Results in Linear Filtering and Prediction Theory , 1961 .

[326]  R. E. Kalman,et al.  A New Approach to Linear Filtering and Prediction Problems , 1960 .

[327]  R. L. Stratonovich CONDITIONAL MARKOV PROCESSES , 1960 .

[328]  A. Wald Note on the Consistency of the Maximum Likelihood Estimate , 1949 .