Missile defense and interceptor allocation by neuro-dynamic programming

This paper proposes a solution methodology for a missile defense problem involving the sequential allocation of defensive resources over a series of engagements. The problem is cast as a dynamic programming/Markovian decision problem, which is computationally intractable by exact methods because of its large number of states and its complex modeling issues. We employ a neuro-dynamic programming framework, whereby the cost-to-go function is approximated using neural network architectures that are trained on simulated data. We report on the performance obtained using several different training methods, and we compare this performance with the optimal performance.
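
To make the idea of a simulation-trained cost-to-go approximation concrete, the following is a minimal sketch on a deliberately tiny toy problem. It is not the model, network architecture, or training method of the paper, none of which are specified in this abstract. Everything in it is an assumption made for illustration: one attacker per wave, a single-shot kill probability `P_KILL`, the problem sizes `WAVES` and `STOCK`, the "fire one interceptor per wave" base policy, and the hypothetical `TinyNet` class. The network is fitted to simulated leakage counts of the base policy and then used in a one-step lookahead; because the toy is small, the exact dynamic programming solution is available for comparison.

```python
import math
import random

P_KILL = 0.7   # assumed single-shot kill probability
WAVES = 4      # assumed number of attack waves (one attacker per wave)
STOCK = 6      # assumed initial interceptor inventory

def leak(u):
    """Probability that one attacker survives u independent shots."""
    return (1.0 - P_KILL) ** u

def exact_cost_to_go():
    """Backward DP: J[k][i] = minimal expected leakers with k waves, i interceptors left."""
    J = [[0.0] * (STOCK + 1) for _ in range(WAVES + 1)]
    for k in range(1, WAVES + 1):
        for i in range(STOCK + 1):
            J[k][i] = min(leak(u) + J[k - 1][i - u] for u in range(i + 1))
    return J

class TinyNet:
    """Hypothetical one-hidden-layer tanh network trained by plain SGD."""
    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2, h

    def sgd_step(self, x, target, lr):
        y, h = self.forward(x)
        err = y - target  # gradient of 0.5 * (y - target)^2 with respect to y
        for j, hj in enumerate(h):
            grad_pre = err * self.w2[j] * (1.0 - hj * hj)  # uses w2[j] before its update
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_pre
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_pre * xi
        self.b2 -= lr * err

def features(k, i):
    """Scale the state (waves left, interceptors left) to [0, 1]."""
    return [k / WAVES, i / STOCK]

def simulate_base_policy(k, i, rng):
    """One Monte Carlo rollout: fire one interceptor per wave while stock lasts."""
    leaked = 0
    for _ in range(k):
        u = 1 if i > 0 else 0
        i -= u
        if rng.random() < leak(u):
            leaked += 1
    return leaked

def train(rng, samples=20000, lr=0.05):
    """Fit the network to simulated costs of the base policy at random states."""
    net = TinyNet(2, 8, rng)
    for _ in range(samples):
        k, i = rng.randint(0, WAVES), rng.randint(0, STOCK)
        net.sgd_step(features(k, i), simulate_base_policy(k, i, rng), lr)
    return net

def greedy_action(net, k, i):
    """One-step lookahead: stage cost plus approximate cost-to-go of the next state."""
    return min(range(i + 1),
               key=lambda u: leak(u) + net.forward(features(k - 1, i - u))[0])

def policy_cost(net, k, i):
    """Exact expected leakers of the lookahead policy (inventory evolves deterministically here)."""
    if k == 0:
        return 0.0
    u = greedy_action(net, k, i)
    return leak(u) + policy_cost(net, k - 1, i - u)

rng = random.Random(0)
J = exact_cost_to_go()
net = train(rng)
print("optimal expected leakers:    ", J[WAVES][STOCK])
print("lookahead policy, exact cost:", policy_cost(net, WAVES, STOCK))
```

In this sketch the exact table `J` plays the role of the optimal benchmark, which is only computable because the toy state space is tiny. The lookahead policy can never do better than `J[WAVES][STOCK]`; how close it comes depends on how well the network fits the simulated data.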