Nonparametric Compositional Stochastic Optimization for Risk-Sensitive Kernel Learning
Ketan Rajawat | Alec Koppel | Amrit Singh Bedi | Panchajanya Sanyal
[1] S. Brendle,et al. Calculus of Variations , 1927, Nature.
[2] 中嶋 博. A New Method for Convex Programming (University Opening Commemorative Issue) , 1966 .
[3] G. Wahba,et al. Some results on Tchebycheffian spline functions , 1971 .
[4] A. Zygmund,et al. Measure and integral : an introduction to real analysis , 1977 .
[5] C. D. Bailey. Hamilton's principle and the calculus of variations , 1982 .
[6] Y. Ermoliev. Stochastic quasigradient methods and their application to system optimization , 1983 .
[7] R. Durrett. Probability: Theory and Examples , 1993 .
[8] Y. C. Pati,et al. Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition , 1993, Proceedings of 27th Asilomar Conference on Signals, Systems and Computers.
[9] S. Haykin,et al. Neural Networks: A Comprehensive Foundation , 1994 .
[10] J. Mark. Introduction to radial basis function networks , 1996 .
[11] Peter L. Bartlett,et al. Neural Network Learning - Theoretical Foundations , 1999 .
[12] Stan Uryasev,et al. Conditional value-at-risk: optimization algorithms and applications , 2000, Proceedings of the IEEE/IAFE/INFORMS 2000 Conference on Computational Intelligence for Financial Engineering (CIFEr) (Cat. No.00TH8520).
[13] Bernhard Schölkopf,et al. A Generalized Representer Theorem , 2001, COLT/EuroCOLT.
[14] Trevor Hastie,et al. The Elements of Statistical Learning , 2001 .
[15] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2003, ICTAI.
[16] Eric R. Ziegel,et al. The Elements of Statistical Learning , 2003, Technometrics.
[17] Alexander J. Smola,et al. Online learning with kernels , 2001, IEEE Transactions on Signal Processing.
[18] Shie Mannor,et al. The kernel recursive least-squares algorithm , 2004, IEEE Transactions on Signal Processing.
[19] J. Tsitsiklis,et al. Convergence rate of linear two-time-scale stochastic approximation , 2004, math/0405287.
[20] Pascal Vincent,et al. Kernel Matching Pursuit , 2002, Machine Learning.
[21] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[22] R. Olfati-Saber,et al. Consensus Filters for Sensor Networks and Distributed Sensor Fusion , 2005, Proceedings of the 44th IEEE Conference on Decision and Control.
[23] Andrew Packard,et al. Control Applications of Sum of Squares Programming , 2005 .
[24] T. Poggio,et al. The Mathematics of Learning: Dealing with Data , 2005, 2005 International Conference on Neural Networks and Brain.
[25] Shabbir Ahmed,et al. Convexity and decomposition of mean-risk stochastic programs , 2006, Math. Program..
[26] Yuesheng Xu,et al. Universal Kernels , 2006, J. Mach. Learn. Res..
[27] A. Ruszczynski,et al. Optimization of Risk Measures , 2006 .
[28] Randy A. Freeman,et al. Distributed Cooperative Active Sensing Using Consensus Filters , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.
[29] H. Robbins. A Stochastic Approximation Method , 1951 .
[30] Benjamin Recht,et al. Random Features for Large-Scale Kernel Machines , 2007, NIPS.
[31] Richard S. Sutton,et al. A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation , 2008, NIPS.
[32] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[33] Alexander Shapiro,et al. Stochastic Approximation approach to Stochastic Programming , 2013 .
[34] David Ruppert,et al. Semiparametric regression during 2003-2007. , 2009, Electronic journal of statistics.
[35] Alexander Shapiro,et al. Lectures on Stochastic Programming: Modeling and Theory , 2009 .
[36] Slobodan Vucetic,et al. Online Passive-Aggressive Algorithms on a Budget , 2010, AISTATS.
[37] Sonia Martínez,et al. Discrete-time dynamic average consensus , 2010, Autom..
[38] W. Marsden. I and J , 2012 .
[39] Steven C. H. Hoi,et al. Fast Bounded Online Gradient Descent Algorithms for Scalable Kernel-Based Online Learning , 2012, ICML.
[40] Koby Crammer,et al. Breaking the curse of kernelization: budgeted stochastic gradient descent for large-scale SVM training , 2012, J. Mach. Learn. Res..
[41] Le Song,et al. Scalable Kernel Methods via Doubly Stochastic Gradients , 2014, NIPS.
[42] Jeff G. Schneider,et al. On the Error of Random Fourier Features , 2015, UAI.
[43] Zoltán Szabó,et al. Optimal Rates for Random Fourier Features , 2015, NIPS.
[44] A. Ruszczynski,et al. Statistical estimation of composite risk functionals and risk optimization problems , 2015, 1504.02658.
[45] Gesualdo Scutari,et al. NEXT: In-Network Nonconvex Optimization , 2016, IEEE Transactions on Signal and Information Processing over Networks.
[46] Annette ten Teije,et al. Subseries of Lecture Notes in Computer Science , 2016 .
[47] Trung Le,et al. Nonparametric Budgeted Stochastic Gradient Descent , 2016, AISTATS.
[48] Na Li,et al. Harnessing smoothness to accelerate distributed optimization , 2016, 2016 IEEE 55th Conference on Decision and Control (CDC).
[49] Mengdi Wang,et al. Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions , 2014, Mathematical Programming.
[50] Mengdi Wang,et al. Finite-sum Composition Optimization via Variance Reduced Gradient Descent , 2016, AISTATS.
[51] Le Song,et al. Learning from Conditional Distributions via Dual Embeddings , 2016, AISTATS.
[52] Alejandro Ribeiro,et al. Parsimonious Online Learning with Kernels via sparse projections in function space , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[53] P. Stone,et al. Breaking Bellman's Curse of Dimensionality: Efficient Kernel Gradient Temporal Difference , 2017, 1709.04221.
[54] Recursive Optimization of Convex Risk Measures: Mean-Semideviation Models , 2018, 1804.00636.
[55] Ohad Shamir,et al. Spurious Local Minima are Common in Two-Layer ReLU Neural Networks , 2017, ICML.
[56] Alejandro Ribeiro,et al. Nonparametric Stochastic Compositional Gradient Descent for Q-Learning in Continuous Markov Decision Problems , 2018, 2018 Annual American Control Conference (ACC).
[57] Antonin Chambolle,et al. On Representer Theorems and Convex Regularization , 2018, SIAM J. Optim..
[58] Gesualdo Scutari,et al. Distributed nonconvex constrained optimization over time-varying digraphs , 2018, Mathematical Programming.
[59] Francesco Orabona,et al. Momentum-Based Variance Reduction in Non-Convex SGD , 2019, NeurIPS.
[60] Ketan Rajawat,et al. Controlling the Bias-Variance Tradeoff via Coherent Risk for Robust Learning with Kernels , 2019, 2019 American Control Conference (ACC).
[61] Zhu Li,et al. Towards a Unified Analysis of Random Fourier Features , 2018, ICML.
[62] Saeed Ghadimi,et al. A Single Timescale Stochastic Approximation Method for Nested Stochastic Optimization , 2018, SIAM J. Optim..
[63] Brian M. Sadler,et al. Optimally Compressed Nonparametric Online Learning: Tradeoffs between memory and consistency , 2020, IEEE Signal Processing Magazine.
[64] Peter Stone,et al. Policy Evaluation in Continuous MDPs With Efficient Kernelized Gradient Temporal Difference , 2017, IEEE Transactions on Automatic Control.
[65] Angelia Nedic,et al. Distributed stochastic gradient tracking methods , 2018, Mathematical Programming.