Nonmyopic Multiaspect Sensing With Partially Observable Markov Decision Processes

We consider the problem of sensing a concealed or distant target by interrogating it with multiple sensors situated on a single platform. The available actions are the selection of the next relative target-platform orientation and of the next sensor to be deployed. The target is modeled as a set of states, each representing a contiguous set of target-sensor orientations over which the scattering physics is relatively stationary. The sequence of states sampled at multiple target-sensor orientations may be modeled as a Markov process. The sensor has access only to the scattered fields, without knowledge of the particular state being sampled; the problem is therefore modeled as a partially observable Markov decision process (POMDP). The POMDP yields a policy in which the belief state at any point is mapped to a corresponding action. The nonmyopic policy is compared to an approximate myopic approach, with example results presented for measured underwater acoustic scattering data.
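The belief state mentioned above can be illustrated with a generic Bayes-filter update over hidden target states. This is a minimal sketch, not the paper's implementation: the names `belief_update`, `transition`, and `obs_likelihood` are hypothetical, the numbers are made up for illustration, and the transition matrix and observation likelihoods are assumed to have already been selected for the chosen action (orientation change and sensor).

```python
def belief_update(belief, transition, obs_likelihood):
    """One Bayes-filter step over hidden target states.

    belief[s]          -- current probability of state s
    transition[s][s2]  -- assumed P(next state s2 | state s) for the chosen action
    obs_likelihood[s2] -- assumed likelihood of the measured scattered field in state s2
    """
    n = len(belief)
    # Predict: propagate the belief through the Markov state transitions.
    predicted = [sum(belief[s] * transition[s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correct: weight each state by how well it explains the observation.
    unnormalized = [predicted[s2] * obs_likelihood[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Illustrative two-state example with made-up numbers.
prior = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
obs_likelihood = [0.7, 0.1]
posterior = belief_update(prior, transition, obs_likelihood)
```

A policy in this setting is a function from such a belief vector to the next action. A myopic rule would score each action by its immediate expected benefit, whereas a nonmyopic POMDP policy also accounts for the effect of the action on future beliefs.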
