Prediction with a short memory
Vatsal Sharan | Sham M. Kakade | Percy Liang | Gregory Valiant