[1] Marc Peter Deisenroth, et al. Deep Reinforcement Learning: A Brief Survey, 2017, IEEE Signal Processing Magazine.
[2] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[3] Andrea Montanari, et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[4] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[5] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[6] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[7] M. Manhart, et al. Markov Processes, 2018, Introduction to Stochastic Processes and Simulation.
[8] Shun-ichi Amari. Understand in 5 minutes!? Skimming famous papers: Jacot, Arthur, Gabriel, Franck and Hongler, Clément: Neural Tangent Kernel: Convergence and Generalization in Neural Networks, 2020.
[9] Dong Yu, et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Richard Evans, et al. Deep Reinforcement Learning in Large Discrete Action Spaces, 2015, arXiv:1512.07679.
[11] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[12] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[13] B. Perthame. Perturbed dynamical systems with an attracting singularity and weak viscosity limits in Hamilton-Jacobi equations, 1990.
[14] H. Kushner, et al. Stochastic Approximation and Recursive Algorithms and Applications, 2003.
[15] S. Ethier, et al. Markov Processes: Characterization and Convergence, 2005.
[16] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, arXiv.
[17] Yuan Cao, et al. Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks, 2018, arXiv.
[18] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[19] Geoffrey E. Hinton, et al. Deep Learning, 2015, Nature.
[20] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[21] Geoffrey E. Hinton, et al. Speech recognition with deep recurrent neural networks, 2013, IEEE International Conference on Acoustics, Speech and Signal Processing.
[22] C. Watkins. Learning from delayed rewards, 1989.
[23] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[24] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[25] Peter Stone, et al. Deep Recurrent Q-Learning for Partially Observable MDPs, 2015, AAAI Fall Symposia.
[26] John N. Tsitsiklis, et al. Asynchronous Stochastic Approximation and Q-Learning, 1994, Machine Learning.
[27] Justin A. Sirignano, et al. Mean Field Analysis of Deep Neural Networks, 2019, Math. Oper. Res.
[28] Konstantinos Spiliopoulos, et al. Mean Field Analysis of Neural Networks: A Law of Large Numbers, 2018, SIAM J. Appl. Math.
[29] V. Borkar. Asynchronous Stochastic Approximations, 1998.
[30] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[31] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[32] Liwei Wang, et al. Gradient Descent Finds Global Minima of Deep Neural Networks, 2018, ICML.
[33] Jan Peters, et al. Reinforcement learning in robotics: A survey, 2013, Int. J. Robotics Res.
[34] Justin A. Sirignano, et al. Mean field analysis of neural networks: A central limit theorem, 2018, Stochastic Processes and their Applications.
[35] Kurt Hornik, et al. Convergence of learning algorithms with constant learning rates, 1991, IEEE Trans. Neural Networks.
[36] Grant M. Rotskoff, et al. Neural Networks as Interacting Particle Systems: Asymptotic Convexity of the Loss Landscape and Universal Scaling of the Approximation Error, 2018, arXiv.
[37] Yann LeCun, et al. Pedestrian Detection with Unsupervised Multi-stage Feature Learning, 2013, IEEE Conference on Computer Vision and Pattern Recognition.
[38] Francis Bach, et al. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport, 2018, NeurIPS.
[39] Kurt Hornik, et al. Approximation capabilities of multilayer feedforward networks, 1991, Neural Networks.
[40] Demis Hassabis, et al. Mastering the game of Go without human knowledge, 2017, Nature.
[41] Yoshifusa Ito. Nonlinearity creates linear independence, 1996, Adv. Comput. Math.
[42] Kurt Hornik, et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.