[1] W. R. Thompson. On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples , 1933 .
[2] Christophe Moy. IoTligent: First World-Wide Implementation of Decentralized Spectrum Learning for IoT Wireless Networks , 2019, 2019 URSI Asia-Pacific Radio Science Conference (AP-RASC).
[3] Stephan ten Brink,et al. OFDM-Autoencoder for End-to-End Learning of Communications Systems , 2018, 2018 IEEE 19th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC).
[4] Yonghui Song,et al. A New Deep-Q-Learning-Based Transmission Scheduling Mechanism for the Cognitive Internet of Things , 2018, IEEE Internet of Things Journal.
[5] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[6] Jacques Palicot,et al. Proof-of-Concept System for Opportunistic Spectrum Access in Multi-user Decentralized Networks , 2016, EAI Endorsed Trans. Cogn. Commun..
[7] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[8] Mihaela van der Schaar,et al. Joint Physical-Layer and System-Level Power Management for Delay-Sensitive Wireless Communications , 2013, IEEE Transactions on Mobile Computing.
[9] Randy Paffenroth,et al. Multiobjective Reinforcement Learning for Cognitive Satellite Communications Using Deep Neural Network Ensembles , 2018, IEEE Journal on Selected Areas in Communications.
[10] Kobi Cohen,et al. Deep Multi-User Reinforcement Learning for Distributed Dynamic Spectrum Access , 2017, IEEE Transactions on Wireless Communications.
[11] Jean-Marie Gorce,et al. An Upper Bound on the Error Induced by Saddlepoint Approximations—Applications to Information Theory , 2020, Entropy.
[12] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[13] Ashutosh Sabharwal,et al. Delay-bounded packet scheduling of bursty traffic over wireless channels , 2004, IEEE Transactions on Information Theory.
[14] R. Munos,et al. Kullback–Leibler upper confidence bounds for optimal sequential allocation , 2012, 1210.1136.
[15] Christophe Moy,et al. Transfer restless multi-armed bandit policy for energy-efficient heterogeneous cellular network , 2019, EURASIP J. Adv. Signal Process..
[16] H. Robbins. Some aspects of the sequential design of experiments , 1952 .
[17] Gilles Stoltz. Incomplete information and internal regret in prediction of individual sequences , 2005 .
[18] Maryline Hélard,et al. Energy Minimization in HARQ-I Relay-Assisted Networks With Delay-Limited Users , 2017, IEEE Transactions on Vehicular Technology.
[19] Abhijeet Bhorkar,et al. An on-line learning algorithm for energy efficient delay constrained scheduling over a fading channel , 2008, IEEE Journal on Selected Areas in Communications.
[20] H. Vincent Poor,et al. Channel Coding Rate in the Finite Blocklength Regime , 2010, IEEE Transactions on Information Theory.
[21] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[22] T. L. Lai and Herbert Robbins. Asymptotically Efficient Adaptive Allocation Rules , 1985 .
[23] Csaba Szepesvári,et al. Learning and Exploitation Do Not Conflict Under Minimax Optimality , 1997, ECML.
[24] Vinod Sharma,et al. Power constrained and delay optimal policies for scheduling transmission over a fading channel , 2003, IEEE INFOCOM 2003. Twenty-second Annual Joint Conference of the IEEE Computer and Communications Societies (IEEE Cat. No.03CH37428).
[25] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[26] D. Ernst,et al. Upper Confidence Bound Based Decision Making Strategies and Dynamic Spectrum Access , 2010, 2010 IEEE International Conference on Communications.
[27] Christophe Moy,et al. QoS Driven Channel Selection Algorithm for Cognitive Radio Network: Multi-User Multi-Armed Bandit Approach , 2017, IEEE Transactions on Cognitive Communications and Networking.
[28] H. Vincent Poor,et al. Spectrum Exploration and Exploitation for Cognitive Radio: Recent Advances , 2015, IEEE Signal Processing Magazine.
[29] Erik G. Larsson,et al. Spectrum sensing for cognitive radio: State-of-the-art and recent advances , 2012 .
[30] Bhaskar Krishnamachari,et al. Deep Reinforcement Learning for Dynamic Multichannel Access in Wireless Networks , 2018, IEEE Transactions on Cognitive Communications and Networking.
[31] Christophe Moy,et al. Reinforcement Learning Real Experiments for Opportunistic Spectrum Access , 2014 .
[32] Walid Saad,et al. Proactive Resource Management for LTE in Unlicensed Spectrum: A Deep Learning Perspective , 2017, IEEE Transactions on Wireless Communications.
[33] Laurent Toutain,et al. Decentralized spectrum learning for radio collision mitigation in ultra-dense IoT networks: LoRaWAN case study and experiments , 2020, Ann. des Télécommunications.
[34] Stephan ten Brink,et al. Deep Learning Based Communication Over the Air , 2017, IEEE Journal of Selected Topics in Signal Processing.
[35] Mingyan Liu,et al. Online learning in opportunistic spectrum access: A restless bandit approach , 2010, 2011 Proceedings IEEE INFOCOM.
[36] Alireza Sadeghi,et al. Optimal and Scalable Caching for 5G Using Reinforcement Learning of Space-Time Popularities , 2017, IEEE Journal of Selected Topics in Signal Processing.
[37] Zhenyu Liao,et al. A Random Matrix Approach to Neural Networks , 2017, ArXiv.
[38] Vincent K. N. Lau,et al. Cross-Layer Design for OFDMA Wireless Systems With Heterogeneous Delay Requirements , 2007, IEEE Transactions on Wireless Communications.
[39] Vikram Krishnamurthy,et al. Monotonicity of Constrained Optimal Transmission Policies in Correlated Fading Channels With ARQ , 2010, IEEE Transactions on Signal Processing.
[40] Ananthram Swami,et al. Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret , 2010, IEEE Journal on Selected Areas in Communications.
[41] Qi Hao,et al. Deep Learning for Intelligent Wireless Networks: A Comprehensive Survey , 2018, IEEE Communications Surveys & Tutorials.
[42] Santiago Zazo,et al. Hybrid UCB-HMM: A Machine Learning Strategy for Cognitive Radio in HF Band , 2015, IEEE Transactions on Cognitive Communications and Networking.
[43] Yishay Mansour,et al. Learning Rates for Q-learning , 2004, J. Mach. Learn. Res..
[44] Senem Velipasalar,et al. Deep Reinforcement Learning-Based Edge Caching in Wireless Networks , 2020, IEEE Transactions on Cognitive Communications and Networking.
[45] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[46] Ying-Chang Liang,et al. Applications of Deep Reinforcement Learning in Communications and Networking: A Survey , 2018, IEEE Communications Surveys & Tutorials.
[47] J. Walrand,et al. Asymptotically efficient allocation rules for the multiarmed bandit problem with multiple plays-Part II: Markovian rewards , 1987 .
[48] Wassim Jouini,et al. Multi-armed bandit based policies for cognitive radio's decision making issues , 2009, 2009 3rd International Conference on Signals, Circuits and Systems (SCS).
[49] Shuguang Cui,et al. Reinforcement Learning-Based Multiaccess Control and Battery Prediction With Energy Harvesting in IoT Systems , 2018, IEEE Internet of Things Journal.
[50] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[51] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[52] Victor C. M. Leung,et al. Deep-Reinforcement-Learning-Based Optimization for Cache-Enabled Opportunistic Interference Alignment Wireless Networks , 2017, IEEE Transactions on Vehicular Technology.
[53] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[54] Mérouane Debbah,et al. Wireless Networks Design in the Era of Deep Learning: Model-Based, AI-Based, or Both? , 2019, IEEE Transactions on Communications.
[55] H. Vincent Poor,et al. A sensing policy based on confidence bounds and a restless multi-armed bandit model , 2012, 2012 Conference Record of the Forty Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR).
[56] Bhaskar Krishnamachari,et al. Dynamic Base Station Switching-On/Off Strategies for Green Cellular Networks , 2013, IEEE Transactions on Wireless Communications.
[57] Peter Auer,et al. The Nonstochastic Multiarmed Bandit Problem , 2002, SIAM J. Comput..
[58] Ursula Challita,et al. Artificial Neural Networks-Based Machine Learning for Wireless Networks: A Tutorial , 2017, IEEE Communications Surveys & Tutorials.
[59] Emilie Kaufmann,et al. Analysis of Bayesian and frequentist strategies for sequential resource allocation. (Analyse de stratégies bayésiennes et fréquentistes pour l'allocation séquentielle de ressources) , 2014 .
[60] Visa Koivunen,et al. An Order Optimal Policy for Exploiting Idle Spectrum in Cognitive Radio Networks , 2015, IEEE Transactions on Signal Processing.
[61] Jakob Hoydis,et al. An Introduction to Deep Learning for the Physical Layer , 2017, IEEE Transactions on Cognitive Communications and Networking.
[62] Visa Koivunen,et al. Bayesian Methods for Multiple Change-Point Detection With Reduced Communication , 2020, IEEE Transactions on Signal Processing.
[63] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[64] Mingyan Liu,et al. Online Learning of Rested and Restless Bandits , 2011, IEEE Transactions on Information Theory.