Cost-Bounded Active Classification Using Partially Observable Markov Decision Processes

Active classification, i.e., the sequential decision-making process of acquiring data for classification purposes, arises naturally in many applications, including medical diagnosis, intrusion detection, and object tracking. In this work, we study the problem of actively classifying dynamical systems described by a finite set of candidate Markov decision process (MDP) models. We are interested in finding strategies that actively interact with the dynamical system and observe its reactions so that the true model is determined efficiently and with high confidence. To this end, we present a decision-theoretic framework based on partially observable Markov decision processes (POMDPs). The proposed framework relies on assigning a classification belief (a probability distribution) to the set of candidate MDP models. Given an initial belief, bounds on the misclassification probabilities, a cost bound, and a finite time horizon, we design POMDP strategies that lead to classification decisions. We present two approaches to finding such strategies. The first computes the optimal strategy exactly via value iteration. To overcome the computational complexity of finding exact solutions, the second approach uses adaptive sampling to approximate the optimal probability of reaching a classification decision. We illustrate the proposed methodology on two examples from medical diagnosis and intruder detection.
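The core of the framework described above is maintaining a classification belief over the candidate MDP models and updating it by Bayes' rule as transitions of the system are observed, declaring a classification once the belief concentrates past the misclassification bound. The following is a minimal sketch of that idea, not the paper's method: the two candidate models, the action sequence, and the threshold value are illustrative assumptions.

```python
import numpy as np

# Hypothetical candidate models of a 2-state, 2-action system (not from the
# paper): T[s, a, :] is the distribution over next states under model T.
model_a = np.array([[[0.9, 0.1], [0.5, 0.5]],
                    [[0.2, 0.8], [0.6, 0.4]]])
model_b = np.array([[[0.3, 0.7], [0.5, 0.5]],
                    [[0.8, 0.2], [0.1, 0.9]]])
models = [model_a, model_b]

def update_belief(belief, s, a, s_next):
    """Bayes update of the classification belief after observing the
    transition (s, a, s_next): b'(m) is proportional to b(m) * P_m(s' | s, a)."""
    likelihood = np.array([m[s, a, s_next] for m in models])
    posterior = belief * likelihood
    return posterior / posterior.sum()

# Uniform initial belief over the two candidates; observe transitions
# generated by the true system under action 0.
belief = np.array([0.5, 0.5])
for s, s_next in [(0, 0), (0, 0), (1, 1)]:
    belief = update_belief(belief, s, 0, s_next)

# Declare a classification once the belief exceeds 1 - eps, where eps is an
# illustrative misclassification bound.
eps = 0.05
decision = int(np.argmax(belief)) if belief.max() >= 1 - eps else None
```

An active strategy, rather than a fixed action sequence as above, would choose each action to drive this belief toward a classification decision while respecting the cost bound; the paper's value-iteration and adaptive-sampling approaches compute such strategies.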
