Reinforcement Learning for Resource Allocation in LEO Satellite Networks
[1] R. Stephenson. A and V , 1962, The British journal of ophthalmology.
[2] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[3] M. Kurano. LEARNING ALGORITHMS FOR MARKOV DECISION PROCESSES , 1987 .
[4] G. Tesauro. Practical Issues in Temporal Difference Learning , 1992 .
[5] R. J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[6] Michael L. Littman,et al. Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach , 1993, NIPS.
[7] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[8] Thomas G. Dietterich,et al. High-Performance Job-Shop Scheduling With A Time-Delay TD(λ) Network , 1995, NIPS 1995.
[9] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[10] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[11] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[12] Dimitri P. Bertsekas,et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems , 1996, NIPS.
[13] Markus Werner,et al. ATM-Based Routing in LEO/MEO Satellite Networks with Intersatellite Links , 1997, IEEE J. Sel. Areas Commun..
[14] Markus Werner,et al. A Dynamic Routing Concept for ATM-Based Satellite Personal Communication Networks , 1997, IEEE J. Sel. Areas Commun..
[15] Markus Werner,et al. A neural network approach to distributed adaptive routing of LEO intersatellite link traffic , 1998, VTC '98. 48th IEEE Vehicular Technology Conference. Pathway to Global Wireless Revolution (Cat. No.98CH36151).
[16] H. Uzunalioglu,et al. Probabilistic routing protocol for low Earth orbit satellite networks , 1998, ICC '98. 1998 IEEE International Conference on Communications. Conference Record. Affiliated with SUPERCOMM'98 (Cat. No.98CH36220).
[17] John N. Tsitsiklis,et al. Simulation-based optimization of Markov reward processes , 1998, Proceedings of the 37th IEEE Conference on Decision and Control (Cat. No.98CH36171).
[18] John N. Tsitsiklis,et al. Average cost temporal-difference learning , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[19] Vivek S. Borkar,et al. Actor-Critic - Type Learning Algorithms for Markov Decision Processes , 1999, SIAM J. Control. Optim..
[20] P.T.S. Tam,et al. An optimized routing scheme and a channel reservation strategy for a low Earth orbit satellite system , 1999, Gateway to 21st Century Communications Village. VTC 1999-Fall. IEEE VTS 50th Vehicular Technology Conference (Cat. No.99CH36324).
[21] Sang Lyul Min,et al. A predictive call admission control scheme for low Earth orbit satellite networks , 2000, IEEE Trans. Veh. Technol..
[22] Ian F. Akyildiz,et al. A routing algorithm for connection‐oriented Low Earth Orbit (LEO) satellite networks with dynamic connectivity , 2000, Wirel. Networks.
[23] Fotini-Niovi Pavlidou,et al. Performance study of adaptive routing algorithms for LEO satellite constellations under Self-Similar and Poisson traffic , 2000, Space Commun..
[24] John N. Tsitsiklis,et al. Call admission control and routing in integrated services networks using neuro-dynamic programming , 2000, IEEE Journal on Selected Areas in Communications.
[25] Timothy X. Brown,et al. Adaptive call admission control under quality of service constraints: a reinforcement learning solution , 2000, IEEE Journal on Selected Areas in Communications.
[26] Jakob Carlström. Reinforcement learning for admission control and routing , 2000 .
[27] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[28] Lex Weaver,et al. The Optimal Reward Baseline for Gradient-Based Reinforcement Learning , 2001, UAI.
[29] Lex Weaver,et al. A Multi-Agent Policy-Gradient Approach to Network Routing , 2001, ICML.
[30] Leandros Tassiulas,et al. Provision of guaranteed services in broadband LEO satellite networks , 2002, Comput. Networks.
[31] J. Barria,et al. Markov decision theory framework for resource allocation in LEO satellite constellations , 2002 .
[32] Leonid Peshkin,et al. Reinforcement learning for adaptive routing , 2002, Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN'02 (Cat. No.02CH37290).
[33] Laurent Franck,et al. Signaling for inter-satellite link routing in broadband non-GEO satellite systems , 2002, Comput. Networks.
[34] Pau-Lo Hsu,et al. A cooperative policy for conflict resolution to multi-agent exploration , 2010 .
[35] Chen-Khong Tham,et al. Adaptive provisioning of differentiated services networks based on reinforcement learning , 2003, IEEE Trans. Syst. Man Cybern. Part C.
[36] Vijay R. Konda,et al. On Actor-Critic Algorithms , 2003, SIAM J. Control. Optim..
[37] Peter L. Bartlett,et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning , 2001, J. Mach. Learn. Res..
[38] Javier A. Barria,et al. A reinforcement learning ticket-based probing path discovery scheme for MANETs , 2004, Ad Hoc Networks.
[39] Erol Gelenbe,et al. Self-aware networks and QoS , 2004, Proceedings of the IEEE.
[40] Sridhar Mahadevan,et al. Average reward reinforcement learning: Foundations, algorithms, and empirical results , 2004, Machine Learning.
[41] Wipawee Usaha. Resource allocation in networks with dynamic topology , 2004 .
[42] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[43] Sridhar Mahadevan,et al. Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results , 2005, Machine Learning.
[44] Abraham Thomas,et al. LEARNING ALGORITHMS FOR MARKOV DECISION PROCESSES , 2009 .