Reasoning About Generalization via Conditional Mutual Information