Learning transferable representations
[1] Sebastian Thrun,et al. Learning to Learn , 1998, Springer US.
[2] Ido Dagan,et al. The Distributional Inclusion Hypotheses and Lexical Entailment , 2005, ACL.
[3] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[4] Dileep George,et al. Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics , 2017, ICML.
[5] Gemma Boleda,et al. Inclusive yet Selective: Supervised Distributional Hypernymy Detection , 2014, COLING.
[6] Caroline Uhler,et al. Maximum likelihood estimation for linear Gaussian covariance models , 2014, 1408.5604.
[7] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[8] Franz H Messerli,et al. Chocolate consumption, cognitive function, and Nobel laureates. , 2012, The New England journal of medicine.
[9] Tyler Lu,et al. Impossibility Theorems for Domain Adaptation , 2010, AISTATS.
[10] H. Shimodaira,et al. Improving predictive inference under covariate shift by weighting the log-likelihood function , 2000 .
[11] Bernhard Schölkopf,et al. Discovering Causal Signals in Images , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[12] Andrew Gelman,et al. The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo , 2011, J. Mach. Learn. Res.
[13] David J. Weir,et al. Learning to Distinguish Hypernyms and Co-Hyponyms , 2014, COLING.
[14] Motoaki Kawanabe,et al. Direct Importance Estimation with Model Selection and Its Application to Covariate Shift Adaptation , 2007, NIPS.
[15] Bernhard Schölkopf,et al. Causal discovery with continuous additive noise models , 2013, J. Mach. Learn. Res.
[16] Qiang Yang,et al. A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.
[17] Bernhard Schölkopf,et al. Measuring Statistical Dependence with Hilbert-Schmidt Norms , 2005, ALT.
[18] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[19] Leon A. Gatys,et al. Image Style Transfer Using Convolutional Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Bernhard Schölkopf,et al. A Kernel Two-Sample Test , 2012, J. Mach. Learn. Res.
[21] Bernhard Schölkopf,et al. Causal Inference Using the Algorithmic Markov Condition , 2008, IEEE Transactions on Information Theory.
[22] Hugo Larochelle,et al. Optimization as a Model for Few-Shot Learning , 2016, ICLR.
[23] Christina Heinze-Deml,et al. Invariant Causal Prediction for Nonlinear Models , 2017, Journal of Causal Inference.
[24] Bogdan Gabrys,et al. Metalearning: a survey of trends and technologies , 2013, Artificial Intelligence Review.
[25] R. French. Catastrophic Forgetting in Connectionist Networks , 2006 .
[26] Bernhard Schölkopf,et al. Learning Independent Causal Mechanisms , 2017, ICML.
[27] Yoshua Bengio,et al. Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach , 2011, ICML.
[28] Qin Lu,et al. Chasing Hypernyms in Vector Spaces with Entropy , 2014, EACL.
[29] A. Atiya,et al. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond , 2005, IEEE Transactions on Neural Networks.
[30] David Haussler,et al. Learnability and the Vapnik-Chervonenkis dimension , 1989, JACM.
[31] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[32] J. Pearl. Causality: Models, Reasoning and Inference , 2000 .
[33] Wei Shen,et al. Few-Shot Image Recognition by Predicting Parameters from Activations , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[34] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[35] Jessica B. Hamrick,et al. Simulation as an engine of physical scene understanding , 2013, Proceedings of the National Academy of Sciences.
[36] Bharath Hariharan,et al. Low-Shot Visual Recognition by Shrinking and Hallucinating Features , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[37] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[38] H. Levene. Robust tests for equality of variances , 1961 .
[39] D. W. Scott,et al. Multivariate Density Estimation, Theory, Practice and Visualization , 1992 .
[40] Kilian Q. Weinberger,et al. On Calibration of Modern Neural Networks , 2017, ICML.
[41] Radford M. Neal. MCMC Using Hamiltonian Dynamics , 2011, 1206.1901.
[42] Paul R Cohen,et al. DARPA's Big Mechanism program , 2015, Physical biology.
[43] Oriol Vinyals,et al. Matching Networks for One Shot Learning , 2016, NIPS.
[44] Ido Dagan,et al. Directional distributional similarity for lexical inference , 2010, Natural Language Engineering.
[45] Bernhard Schölkopf,et al. Hilbert Space Embeddings and Metrics on Probability Measures , 2009, J. Mach. Learn. Res.
[46] Bernhard Schölkopf,et al. Avoiding Discrimination through Causal Reasoning , 2017, NIPS.
[47] Bernhard Schölkopf,et al. Nonlinear causal discovery with additive noise models , 2008, NIPS.
[48] Bernhard Schölkopf,et al. Correcting Sample Selection Bias by Unlabeled Data , 2006, NIPS.
[49] Kenneth Ward Church,et al. Word Association Norms, Mutual Information, and Lexicography , 1989, ACL.
[50] Elie Bienenstock,et al. Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.
[51] Tom Heskes,et al. Task Clustering and Gating for Bayesian Multitask Learning , 2003, J. Mach. Learn. Res.
[52] Joshua B. Tenenbaum,et al. Human-level concept learning through probabilistic program induction , 2015, Science.
[53] Bernhard Schölkopf,et al. Discriminative k-shot learning using probabilistic models , 2017, ArXiv.
[54] Alessandro Lenci,et al. Identifying hypernyms in distributional semantic spaces , 2012, *SEMEVAL.
[55] Richard S. Zemel,et al. Prototypical Networks for Few-shot Learning , 2017, NIPS.
[56] J. Hammersley,et al. Monte Carlo Methods , 1965 .
[57] Kira Radinsky,et al. Learning causality for news events prediction , 2012, WWW.
[58] Bernhard Schölkopf,et al. Invariant Models for Causal Transfer Learning , 2015, J. Mach. Learn. Res.
[59] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[60] David J. Weir,et al. A General Framework for Distributional Similarity , 2003, EMNLP.
[61] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[62] Steffen Bickel,et al. Discriminative learning for differing training and test distributions , 2007, ICML '07.
[63] David Lopez-Paz,et al. Revisiting Classifier Two-Sample Tests , 2016, ICLR.
[64] Bernhard Schölkopf,et al. Towards a Learning Theory of Causation , 2015, 1502.02398.
[65] Jong-Hoon Oh,et al. Excitatory or Inhibitory: A New Semantic Orientation Extracts Contradiction and Causality from the Web , 2012, EMNLP.
[66] Le Song,et al. A Kernel Statistical Test of Independence , 2007, NIPS.
[67] Omer Levy,et al. A Simple Word Embedding Model for Lexical Substitution , 2015, VS@HLT-NAACL.
[68] Frank Hutter,et al. Initializing Bayesian Hyperparameter Optimization via Meta-Learning , 2015, AAAI.
[69] Jonathan Baxter,et al. A Model of Inductive Bias Learning , 2000, J. Artif. Intell. Res.
[70] Martial Hebert,et al. Low-Shot Learning from Imaginary Data , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[71] Le Song,et al. A Hilbert Space Embedding for Distributions , 2007, Discovery Science.
[72] Thomas L. Griffiths,et al. The Indian Buffet Process: An Introduction and Review , 2011, J. Mach. Learn. Res.
[73] Bernhard Schölkopf,et al. Domain Generalization via Invariant Feature Representation , 2013, ICML.
[74] Paramita Mirza,et al. CATENA: CAusal and TEmporal relation extraction from NAtural language texts , 2016, COLING.
[75] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[76] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res.
[77] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[78] Sepp Hochreiter,et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.
[79] Bernhard Schölkopf,et al. Elements of Causal Inference: Foundations and Learning Algorithms , 2017 .
[80] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[81] Vladimir Vapnik,et al. Principles of Risk Minimization for Learning Theory , 1991, NIPS.
[82] V. Vapnik. Estimation of Dependences Based on Empirical Data , 2006 .
[83] Zoubin Ghahramani,et al. One-Shot Learning in Discriminative Neural Networks , 2017, ArXiv.
[84] Massimiliano Pontil,et al. Multi-Task Feature Learning , 2006, NIPS.
[85] D. Rubin,et al. Statistical Analysis with Missing Data , 1988 .
[86] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[87] Magnus Sahlgren,et al. The Distributional Hypothesis , 2008 .
[88] Neil D. Lawrence,et al. Dataset Shift in Machine Learning , 2009 .
[89] David J. Weir,et al. Characterising Measures of Lexical Distributional Similarity , 2004, COLING.
[90] Mehdi M. Kashani,et al. Large-Scale Genetic Perturbations Reveal Regulatory Networks and an Abundance of Gene-Specific Repressors , 2014, Cell.
[91] Nitish Srivastava,et al. Discriminative Transfer Learning with Tree-based Priors , 2013, NIPS.
[92] Quinn Jones,et al. Few-Shot Adversarial Domain Adaptation , 2017, NIPS.
[93] Gregory R. Koch,et al. Siamese Neural Networks for One-Shot Image Recognition , 2015 .
[94] D. Weed. On the logic of causal inference. , 1986, American journal of epidemiology.
[95] Preslav Nakov,et al. Classification of semantic relations between nominals , 2009, Lang. Resour. Evaluation.
[96] Bernhard Schölkopf,et al. On causal and anticausal learning , 2012, ICML.
[97] L. Elton,et al. THE DIRECTION OF TIME , 1978 .
[98] Raffaella Bernardi,et al. Entailment above the word level in distributional semantics , 2012, EACL.
[99] Daniel Thalmann,et al. Autonomy , 2005, SIGGRAPH Courses.
[100] Bernhard Schölkopf,et al. Distinguishing Cause from Effect Based on Exogeneity , 2015, ArXiv.
[101] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[102] David Lopez-Paz,et al. Causal Discovery Using Proxy Variables , 2017, ICLR.
[103] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[104] Dan I. Moldovan,et al. Causal Relation Extraction , 2008, LREC.
[105] 秀俊 松井,et al. Statistics for High-Dimensional Data: Methods, Theory and Applications , 2014 .
[106] Kevin P. Murphy,et al. Machine learning - a probabilistic perspective , 2012, Adaptive computation and machine learning series.
[107] Tomas Mikolov,et al. Enriching Word Vectors with Subword Information , 2016, TACL.
[108] Andre Cohen,et al. An object-oriented representation for efficient reinforcement learning , 2008, ICML '08.
[109] Kilian Q. Weinberger,et al. Marginalized Denoising Autoencoders for Domain Adaptation , 2012, ICML.
[110] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[111] Thomas G. Dietterich,et al. To transfer or not to transfer , 2005, NIPS 2005.
[112] R. Harald Baayen,et al. Word Frequency Distributions , 2001 .
[113] Jonas Peters,et al. Causal inference by using invariant prediction: identification and confidence intervals , 2015, 1501.01332.
[114] Bernhard Schölkopf,et al. Kernel-based Conditional Independence Test and Application in Causal Discovery , 2011, UAI.
[115] Jorge Nocedal,et al. On the limited memory BFGS method for large scale optimization , 1989, Math. Program.