Multi-Armed Bandit Problems
[1] J. I. The Design of Experiments , 1936, Nature.
[2] R Bellman,et al. On the Theory of Dynamic Programming. , 1952, Proceedings of the National Academy of Sciences of the United States of America.
[3] R. Bellman. A PROBLEM IN THE SEQUENTIAL DESIGN OF EXPERIMENTS , 1954 .
[4] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[5] Richard Bellman,et al. Adaptive Control Processes: A Guided Tour , 1961, The Mathematical Gazette.
[6] Michael Horstein,et al. Sequential transmission using noiseless feedback , 1963, IEEE Trans. Inf. Theory.
[7] P. B. Coaker,et al. Applied Dynamic Programming , 1964 .
[8] D. Blackwell. Discounted Dynamic Programming , 1965 .
[9] C. Striebel. Sufficient statistics in the optimum control of stochastic systems , 1965 .
[10] Rutherford Aris,et al. Discrete Dynamic Programming , 1965, The Mathematical Gazette.
[11] Walter T. Federer,et al. Sequential Design of Experiments , 1967 .
[12] Harry L. Van Trees,et al. Detection, Estimation, and Modulation Theory, Part I , 1968 .
[13] J. Andel. Sequential Analysis , 2022, The SAGE Encyclopedia of Research Design.
[14] Martin J. Beckmann. Dynamic programming of economic decisions , 1969 .
[15] M. Degroot. Optimal Statistical Decisions , 1970 .
[16] Edward J. Sondik,et al. The optimal control of partially observable Markov processes , 1971 .
[17] W. J. Studden,et al. Theory Of Optimal Experiments , 1972 .
[18] D. Sworder,et al. Introduction to stochastic control , 1972 .
[19] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[20] G. Simons. Great Expectations: Theory of Optimal Stopping , 1973 .
[21] M. Stone. Cross‐Validatory Choice and Assessment of Statistical Predictions , 1976 .
[22] Erhan Çinlar,et al. Introduction to stochastic processes , 1974 .
[23] Antoine-S Bailly,et al. Science régionale - Walter Isard, Introduction to regional science. Englewood Cliffs (NJ), Prentice-Hall, 1975 , 1975 .
[24] Albert Nikolaevich Shiryaev,et al. Optimal stopping rules , 1977 .
[25] Robert E. Larson,et al. Principles of Dynamic Programming , 1978 .
[26] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..
[27] Martin L. Puterman,et al. Dynamic Programming and Its Application , 1979 .
[28] M. Skolnik,et al. Introduction to Radar Systems , 2021, Advances in Adaptive Radar Detection and Range Estimation.
[29] J. Gittins. Bandit processes and dynamic allocation indices , 1979 .
[30] P. Whittle. Multi‐Armed Bandits and the Gittins Index , 1980 .
[31] E. Angel,et al. Principles of dynamic programming part 1 , 1980, Proceedings of the IEEE.
[32] F. Kelly. Multi-Armed Bandits with Discount Factor Near One: The Bernoulli Case , 1981 .
[33] P. Whittle. Arm-Acquiring Bandits , 1981 .
[34] R. Hartley,et al. Optimisation Over Time: Dynamic Programming and Stochastic Control: , 1983 .
[35] R. Gray,et al. Vector quantization , 1984, IEEE ASSP Magazine.
[36] Y. Bar-Shalom,et al. Detection thresholds for tracking in clutter--A connection between estimation and signal processing , 1985 .
[37] Jean Walrand,et al. Extensions of the multiarmed bandit problem: The discounted case , 1985 .
[38] J. Tsitsiklis. A lemma on the multiarmed bandit problem , 1986 .
[39] Pravin Varaiya,et al. Stochastic Systems: Estimation, Identification, and Adaptive Control , 1986 .
[40] Samuel S. Blackman,et al. Multiple-Target Tracking with Radar Applications , 1986 .
[41] J. Walrand,et al. Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards , 1987 .
[42] H. Chernoff. Sequential Analysis and Optimal Design , 1987 .
[43] P. W. Jones,et al. Bandit Problems, Sequential Allocation of Experiments , 1987 .
[44] Michael N. Katehakis,et al. The Multi-Armed Bandit Problem: Decomposition and Computation , 1987, Math. Oper. Res..
[45] A. Mandelbaum. CONTINUOUS MULTI-ARMED BANDITS AND MULTIPARAMETER PROCESSES , 1987 .
[46] D. Teneketzis,et al. Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching cost , 1988 .
[47] P. Whittle. Restless Bandits: Activity Allocation in a Changing World , 1988 .
[48] D. Teneketzis,et al. Asymptotically Efficient Adaptive Allocation Schemes for Controlled I.I.D. Processes: Finite Parameter Space , 1988 .
[49] Yaakov Bar-Shalom,et al. Multitarget-multisensor tracking: Advanced applications , 1989 .
[50] R. Agrawal,et al. Certainty equivalence control with forcing: revisited , 1989, Proceedings of the 28th IEEE Conference on Decision and Control,.
[51] Christian M. Ernst,et al. Multi-armed Bandit Allocation Indices , 1989 .
[52] R. Agrawal,et al. Asymptotically efficient adaptive allocation schemes for controlled Markov chains: finite parameter space , 1989 .
[53] J. Bather,et al. Multi‐Armed Bandit Allocation Indices , 1990 .
[54] R. Agrawal,et al. Multi-armed bandit problems with multiple plays and switching cost , 1990 .
[55] Bert-Eric Tullsson. Monopulse tracking of Rayleigh targets: a simple approach , 1991 .
[56] Kenneth J. Hintz,et al. A measure of the information gain attributable to cueing , 1991, IEEE Trans. Syst. Man Cybern..
[57] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[58] Andrew R. Barron,et al. Complexity Regularization with Application to Artificial Neural Networks , 1991 .
[59] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes , 1991 .
[60] Eugene S. McVey,et al. Multi-process constrained estimation , 1991, IEEE Trans. Syst. Man Cybern..
[61] David J. C. MacKay,et al. Information-Based Objective Functions for Active Data Selection , 1992, Neural Computation.
[62] R. Weber. On the Gittins Index for Multiarmed Bandits , 1992 .
[63] D. Teneketzis,et al. Optimality of index policies for stochastic scheduling with switching penalties , 1992, Journal of Applied Probability.
[64] N. Gordon,et al. Novel approach to nonlinear/non-Gaussian Bayesian state estimation , 1993 .
[65] Richard L. Tweedie,et al. Markov Chains and Stochastic Stability , 1993, Communications and Control Engineering Series.
[66] Dimitris Bertsimas,et al. Conservation laws, extended polymatroids and multi-armed bandit problems: a unified approach to indexable systems , 2011, IPCO.
[67] A. Tsybakov,et al. Minimax theory of image reconstruction , 1993 .
[68] D. Teneketzis,et al. Optimal stochastic scheduling of forest networks with switching penalties , 1994, Advances in Applied Probability.
[69] S. Musick,et al. Chasing the elusive sensor manager , 1994, Proceedings of National Aerospace and Electronics Conference (NAECON'94).
[70] Robin J. Evans,et al. Optimal waveform selection for tracking systems , 1994, IEEE Trans. Inf. Theory.
[71] J. Tsitsiklis,et al. Branching bandits and Klimov's problem: achievable region and side constraints , 1994, Proceedings of 1994 33rd IEEE Conference on Decision and Control.
[72] David A. Cohn,et al. Active Learning with Statistical Models , 1996, NIPS.
[73] Michael Jackson,et al. Optimal Design of Experiments , 1994 .
[74] J. Banks,et al. Switching Costs and the Gittins Index , 1994 .
[75] Partha Niyogi,et al. Active Learning for Function Approximation , 1994, NIPS.
[76] Keith D. Kastella,et al. Event-averaged maximum likelihood estimation and information-based sensor management , 1994, Defense, Security, and Sensing.
[77] Anders Krogh,et al. Neural Network Ensembles, Cross Validation, and Active Learning , 1994, NIPS.
[78] P. Varaiya,et al. Multi-Armed bandit problem revisited , 1994 .
[79] John Rust. Using Randomization to Break the Curse of Dimensionality , 1997 .
[80] I. Karatzas,et al. Dynamic Allocation Problems in Continuous Time , 1994 .
[81] M. Littman. The Witness Algorithm: Solving Partially Observable Markov Decision Processes , 1994 .
[82] Robin J. Evans,et al. Integrated probabilistic data association , 1994, IEEE Trans. Autom. Control..
[83] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..
[84] I. J. Taneja. New Developments in Generalized Information Measures , 1995 .
[85] Michael I. Miller,et al. Conditional-mean estimation via jump-diffusion processes in multiple target tracking/recognition , 1995, IEEE Trans. Signal Process..
[86] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[87] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[88] John Rust. Numerical dynamic programming in economics , 1996 .
[89] Demosthenis Teneketzis,et al. Multi-armed bandits with switching penalties , 1996, IEEE Trans. Autom. Control..
[90] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[91] Dimitris Bertsimas,et al. Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; A Polyhedral Approach to Indexable Systems , 1996, Math. Oper. Res..
[92] M. Katehakis,et al. Finite state multi-armed bandit problems: sensitive-discount, average-reward and average-overtaking optimality , 1996 .
[93] David K. Smith,et al. Dynamic Programming and Optimal Control. Volume 1 , 1996 .
[94] Lawrence Carin,et al. Matching pursuits with a wave-based dictionary , 1997, IEEE Trans. Signal Process..
[95] R.J. Evans,et al. Waveform selective probabilistic data association , 1997, IEEE Transactions on Aerospace and Electronic Systems.
[96] Keith Kastella. Discrimination gain to optimize detection and classification , 1997, IEEE Trans. Syst. Man Cybern. Part A.
[97] Jun S. Liu,et al. Sequential Monte Carlo methods for dynamic systems , 1997 .
[98] I. J. Won,et al. GEM‐3: A Monostatic Broadband Electromagnetic Induction Sensor , 1997 .
[99] Michael L. Littman,et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes , 1997, UAI.
[100] D. Castañón. Approximate dynamic programming for sensor management , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[101] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[102] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[103] D.A. Castanon,et al. Rollout Algorithms for Stochastic Scheduling Problems , 1998, Proceedings of the 37th IEEE Conference on Decision and Control (Cat. No.98CH36171).
[104] Douglas Cochran,et al. Dynamic estimation with selectable linear measurements , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[105] Quentin F. Stout,et al. Flexible Algorithms for Creating and Analyzing Adaptive Sampling Procedures , 1998 .
[106] A. Doucet. On sequential Monte Carlo methods for Bayesian filtering , 1998 .
[107] A. Cassandra,et al. Exact and approximate algorithms for partially observable markov decision processes , 1998 .
[108] W. Blair,et al. Unresolved Rayleigh target detection using monopulse measurements , 1998 .
[109] A. Mandelbaum,et al. Multi-armed bandits in discrete and continuous time , 1998 .
[110] Lawrence Carin,et al. Multiaspect identification of submerged elastic targets via wave-based matching pursuits and hidden Markov models , 1999 .
[111] Lawrence Carin,et al. Hidden Markov models for multiaspect target classification , 1999, IEEE Trans. Signal Process..
[112] M. Pitt,et al. Filtering via Simulation: Auxiliary Particle Filters , 1999 .
[113] Vladimir Vapnik,et al. An overview of statistical learning theory , 1999, IEEE Trans. Neural Networks.
[114] Demosthenis Teneketzis,et al. On the optimality of the Gittins index rule for multi-armed bandits with multiple plays , 1995, Math. Methods Oper. Res..
[115] Lawrence D. Stone,et al. Bayesian Multiple Target Tracking , 1999 .
[116] R. Viswanathan,et al. Performance of distributed CFAR test under various clutter amplitudes , 1999 .
[117] Y. Bar-Shalom,et al. From the waveform through the resolution cell to the tracker , 1999, 1999 IEEE Aerospace Conference. Proceedings (Cat. No.99TH8403).
[118] Carl E. Baum,et al. On the low-frequency natural response of conducting and permeable targets , 1999, IEEE Trans. Geosci. Remote. Sens..
[119] A. Korostelev. On minimax rates of convergence in image models under sequential design , 1999 .
[120] Douglas Cochran,et al. Source detection and localization using a multi-mode detector: a Bayesian approach , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[121] Daphne Koller,et al. Support Vector Machine Active Learning with Applications to Text Classification , 2000, ICML.
[122] Robert Givan,et al. A framework for simulation-based network control via hindsight optimization , 2000, Proceedings of the 39th IEEE Conference on Decision and Control (Cat. No.00CH37187).
[123] José Niño Mora. Restless Bandits, Partial Conservation Laws and Indexability , 2000 .
[124] Demosthenis Teneketzis,et al. ON THE OPTIMALITY OF AN INDEX RULE IN MULTICHANNEL ALLOCATION FOR SINGLE-HOP MOBILE NETWORKS WITH MULTIPLE SERVICE CLASSES , 2000 .
[125] Nello Cristianini,et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .
[126] Shun-ichi Amari,et al. Methods of information geometry , 2000 .
[127] Yaakov Bar-Shalom,et al. Multitarget/Multisensor Tracking: Applications and Advances -- Volume III , 2000 .
[128] Dimitris Bertsimas,et al. Restless Bandits, Linear Programming Relaxations, and a Primal-Dual Index Heuristic , 2000, Oper. Res..
[129] D. Cochran,et al. Multi-mode detection with Markov target motion , 2000, Proceedings of the Third International Conference on Information Fusion.
[130] A. Korostelev,et al. Rates of convergence for the sup-norm risk in image models under sequential designs , 2000 .
[131] Robin J. Evans,et al. Hidden Markov model multiarm bandits: a methodology for beam scheduling in multitarget tracking , 2001, IEEE Trans. Signal Process..
[132] Yacine Dalichaouch,et al. On the wideband EMI response of a rotationally symmetric permeable and conducting target , 2001, IEEE Trans. Geosci. Remote. Sens..
[133] Fredrik Gustafsson,et al. Monte Carlo data association for multiple target tracking , 2001 .
[134] Daphne Koller,et al. Support Vector Machine Active Learning with Applications to Text Classification , 2000, J. Mach. Learn. Res..
[135] Aleksandar Dogandzic,et al. Cramer-Rao bounds for estimating range, velocity, and direction with an active array , 2001, IEEE Trans. Signal Process..
[136] Michael Isard,et al. BraMBLe: a Bayesian multiple-blob tracker , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.
[137] Nicola Secomandi,et al. A Rollout Policy for the Vehicle Routing Problem with Stochastic Demands , 2001, Oper. Res..
[138] Eric Gottlieb,et al. The Umbra Simulation Framework , 2001 .
[139] J. D. Gorman,et al. Alpha-Divergence for Classification, Indexing and Retrieval (Revised 2) , 2002 .
[140] R. B. Washburn,et al. Stochastic dynamic programming based approaches to sensor resource management , 2002, Proceedings of the Fifth International Conference on Information Fusion. FUSION 2002. (IEEE Cat.No.02EX5997).
[141] K. Glazebrook,et al. Index policies for a class of discounted restless bandits , 2002, Advances in Applied Probability.
[142] Patrick Pérez,et al. Sequential Monte Carlo methods for multiple target tracking and data fusion , 2002, IEEE Trans. Signal Process..
[143] William Fitzgerald,et al. A Bayesian approach to tracking multiple targets using sensor arrays and particle filters , 2002, IEEE Trans. Signal Process..
[144] P. Pérez,et al. Tracking multiple objects with particle filtering , 2002 .
[145] José Niño-Mora,et al. Dynamic allocation indices for restless projects and queueing admission control: a polyhedral approach , 2002, Math. Program..
[146] Feng Zhao,et al. Information-driven dynamic sensor collaboration , 2002, IEEE Signal Process. Mag..
[147] D.A. Castanon,et al. Model predictive control for dynamic unreliable resource allocation , 2002, Proceedings of the 41st IEEE Conference on Decision and Control, 2002..
[148] Raymond W. Yeung,et al. A First Course in Information Theory , 2002 .
[149] Neil J. Gordon,et al. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking , 2002, IEEE Trans. Signal Process..
[150] Vikram Krishnamurthy,et al. Algorithms for optimal scheduling and management of hidden Markov model sensors , 2002, IEEE Trans. Signal Process..
[151] L. Shepp. Probability Essentials , 2002 .
[152] Gang Wu,et al. Burst-level congestion control using hindsight optimization , 2002, IEEE Trans. Autom. Control..
[153] A. Doucet,et al. Particle filtering for multi-target tracking and sensor management , 2002, Proceedings of the Fifth International Conference on Information Fusion. FUSION 2002. (IEEE Cat.No.02EX5997).
[154] M. Veth,et al. Affordable moving surface target engagement , 2002, Proceedings, IEEE Aerospace Conference.
[155] Alfred O. Hero,et al. Applications of entropic spanning graphs , 2002, IEEE Signal Process. Mag..
[156] Henk A. P. Blom,et al. Joint IMMPDA particle filter , 2003, Sixth International Conference of Information Fusion, 2003. Proceedings of the.
[157] Joelle Pineau,et al. Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.
[158] Robin J. Evans,et al. Correction to "Hidden Markov model multiarm bandits: a methodology for beam scheduling in multitarget tracking" , 2003, IEEE Trans. Signal Process..
[159] Neil J. Gordon,et al. Efficient particle filtering for multiple target tracking with application to tracking in structured images , 2003, Image Vis. Comput..
[160] Timothy J. Robinson,et al. Sequential Monte Carlo Methods in Practice , 2003 .
[161] Kevin D. Glazebrook,et al. Whittle's index policy for a multi-class queueing system with convex holding costs , 2003, Math. Methods Oper. Res..
[162] D. Fox,et al. People Tracking with Anonymous and ID-Sensors Using Rao-Blackwellised Particle Filters , 2003, IJCAI.
[163] Eric V. Denardo,et al. Dynamic Programming: Models and Applications , 2003 .
[164] Leslie M. Collins,et al. Sensing of unexploded ordnance with magnetometer and induction data: theory and signal processing , 2003, IEEE Trans. Geosci. Remote. Sens..
[165] Alfred O. Hero,et al. Multi-target Sensor Management Using Alpha-Divergence Measures , 2003, IPSN.
[166] P. Hall,et al. Sequential methods for design-adaptive estimation of discontinuities in regression curves and surfaces , 2003 .
[167] H. Sebastian Seung,et al. Selective Sampling Using the Query by Committee Algorithm , 1997, Machine Learning.
[168] Nah-Oak Song,et al. Discrete search with multiple sensors , 2004, Math. Methods Oper. Res..
[169] R. Nowak,et al. Backcasting: adaptive sampling for sensor networks , 2004, Third International Symposium on Information Processing in Sensor Networks, 2004. IPSN 2004.
[170] Y. Bar-Shalom,et al. Multisensor resource deployment using posterior Cramer-Rao bounds , 2004, IEEE Transactions on Aerospace and Electronic Systems.
[171] Rebecca Willett,et al. Coarse-to-fine manifold learning , 2004 .
[172] A. Hero,et al. Efficient methods of non-myopic sensor management for multitarget tracking , 2004, 2004 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No.04CH37601).
[173] Robert Givan,et al. Parallel Rollout for Online Solution of Partially Observable Markov Decision Processes , 2004, Discret. Event Dyn. Syst..
[174] Mingyan Liu,et al. On the optimality of an index policy for bandwidth allocation with delayed state observation and differentiated services , 2004, IEEE INFOCOM 2004.
[175] Jeffrey K. Uhlmann,et al. Unscented filtering and nonlinear estimation , 2004, Proceedings of the IEEE.
[176] D. Ruppert. The Elements of Statistical Learning: Data Mining, Inference, and Prediction , 2004 .
[177] Robert D. Nowak,et al. Coarse-to-fine manifold learning [image processing example] , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[178] Alfred O. Hero,et al. Multiple Model Particle Filtering For Multi-Target Tracking , 2004 .
[179] Ryan M. Rifkin,et al. In Defense of One-Vs-All Classification , 2004, J. Mach. Learn. Res..
[180] Robert R. Tenney,et al. Dynamic tactical targeting , 2004, SPIE Defense + Commercial Sensing.
[181] Pascal Vincent,et al. Kernel Matching Pursuit , 2002, Machine Learning.
[182] Yishay Mansour,et al. A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes , 1999, Machine Learning.
[183] Mingyan Liu,et al. Properties of optimal resource sharing in a delay channel , 2004, 2004 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No.04CH37601).
[184] Lawrence Carin,et al. Detection of buried targets via active selection of labeled data: application to sensing subsurface UXO , 2004, IEEE Transactions on Geoscience and Remote Sensing.
[185] M.K. Schneider,et al. Closing the loop in sensor fusion systems: stochastic dynamic programming approaches , 2004, Proceedings of the 2004 American Control Conference.
[186] Alfred O. Hero,et al. Information-based sensor management for multitarget tracking , 2004, SPIE Optics + Photonics.
[187] A. Papandreou-Suppappola,et al. Efficient search strategies for non-myopic sensor scheduling in target tracking , 2004, Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004..
[188] R. Evans,et al. Clutter map information for data association and track initialization , 2004, IEEE Transactions on Aerospace and Electronic Systems.
[189] Urbashi Mitra,et al. Estimating inhomogeneous fields using wireless sensor networks , 2004, IEEE Journal on Selected Areas in Communications.
[190] S. Challa,et al. Multi Target Tracking of Ground Targets in Clutter with LMIPDA-IMM , 2004 .
[191] R. Nowak,et al. Multiscale likelihood analysis and complexity penalized estimation , 2004, math/0406424.
[192] Ying He,et al. Sensor scheduling for target tracking in sensor networks , 2004, 2004 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No.04CH37601).
[193] Hui Li,et al. An M-ary KMP classifier for multi-aspect target classification , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[194] Dimitri P. Bertsekas,et al. Discretized Approximations for POMDP with Average Cost , 2004, UAI.
[195] David J. C. MacKay,et al. Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.
[196] P. Whittle. Tax problems in the undiscounted case , 2005 .
[197] D. Geman,et al. Hierarchical testing designs for pattern recognition , 2005, math/0507421.
[198] Alfred O. Hero,et al. From Weighted Classification to Policy Search , 2005, NIPS.
[199] K. Kastella,et al. A Comparison of Task Driven and Information Driven Sensor Management for Target Tracking , 2005, Proceedings of the 44th IEEE Conference on Decision and Control.
[200] Ronald E. Parr,et al. Non-Myopic Multi-Aspect Sensing with Partially Observable Markov Decision Processes , 2005 .
[201] Douglas Cochran,et al. Waveform-agile sensing: opportunities and challenges , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[202] A. Hero,et al. Multitarget tracking using the joint multitarget probability density , 2005, IEEE Transactions on Aerospace and Electronic Systems.
[203] Geoffrey J. Gordon,et al. Finding Approximate POMDP solutions Through Belief Compression , 2011, J. Artif. Intell. Res..
[204] Darryl Morrell,et al. Time-varying waveform selection and configuration for agile sensors in tracking applications , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[205] Robert D. Nowak,et al. Faster Rates in Regression via Active Learning , 2005, NIPS.
[206] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[207] Alfred O. Hero,et al. Sensor management using an active sensing approach , 2005, Signal Process..
[208] Edwin K. P. Chong,et al. Sensor scheduling for target tracking: A Monte Carlo sampling approach , 2006, Digit. Signal Process..
[209] Jian Wang,et al. Maximum Likelihood Estimation of Compound-Gaussian Clutter and Target Parameters , 2006, IEEE Transactions on Signal Processing.
[210] A. Hero,et al. Optimal Sensor Scheduling via Classification Reduction of Policy Search (CROPS) , 2006 .
[211] Sang Joon Kim,et al. A Mathematical Theory of Communication , 2006 .
[212] Mingyan Liu,et al. Optimal bandwidth allocation in a delay channel , 2006, IEEE Journal on Selected Areas in Communications.
[213] Alfred O. Hero,et al. Adaptive multi-modality sensor scheduling for detection and tracking of smart targets , 2006, Digit. Signal Process..
[214] A. Singh,et al. Active learning for adaptive mobile sensing networks , 2006, 2006 5th International Conference on Information Processing in Sensor Networks.
[215] R.J. Evans,et al. Waveform Libraries for Radar Tracking Applications: Maneuvering Targets , 2006, 2006 40th Annual Conference on Information Sciences and Systems.
[216] Alfred O. Hero,et al. On Dimensionality Reduction for Classification and its Application , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[217] A. Robert Calderbank,et al. Adaptive Waveform Design for Improved Detection of Low-RCS Targets in Heavy Sea Clutter , 2007, IEEE Journal of Selected Topics in Signal Processing.
[218] Lawrence Carin,et al. Nonmyopic Multiaspect Sensing With Partially Observable Markov Decision Processes , 2007, IEEE Transactions on Signal Processing.
[219] Mark R. Morelande,et al. A Bayesian Approach to Multiple Target Detection and Tracking , 2007, IEEE Transactions on Signal Processing.
[220] Dimitri P. Bertsekas,et al. Stochastic optimal control : the discrete time case , 2007 .
[221] George E. Monahan,et al. A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms , 2007 .
[222] Alfred O. Hero,et al. An Information-Based Approach to Sensor Management in Large Dynamic Networks , 2007, Proceedings of the IEEE.
[223] R. Weber,et al. On an index policy for restless bandits , 1990, Journal of Applied Probability.
[224] Mingyan Liu,et al. Server allocation with delayed state observation: Sufficient conditions for the optimality of an index policy , 2009, IEEE Transactions on Wireless Communications.
[225] L. Breuer. Introduction to Stochastic Processes , 2022, Statistical Methods for Climate Scientists.
[226] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 2022 .