Bayesian Analysis in Natural Language Processing, Second Edition
[1] Hal Daumé,et al. Non-Parametric Bayesian Areal Linguistics , 2009, HLT-NAACL.
[2] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[3] Cosma Rohilla Shalizi,et al. Philosophy and the practice of Bayesian statistics , 2010, The British journal of mathematical and statistical psychology.
[4] Nikolaos V. Sahinidis,et al. Derivative-free optimization: a review of algorithms and comparison of software implementations , 2013, J. Glob. Optim.
[5] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[6] Samy Bengio,et al. Torch: a modular machine learning software library , 2002 .
[7] T. Griffiths,et al. A Bayesian framework for word segmentation: Exploring the effects of context , 2009, Cognition.
[8] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci.
[9] Mirella Lapata,et al. Vector-based Models of Semantic Composition , 2008, ACL.
[10] Gholamreza Haffari,et al. Structured Prediction of Sequences and Trees Using Infinite Contexts , 2015, ECML/PKDD.
[11] Tadao Kasami,et al. An Efficient Recognition and Syntax-Analysis Algorithm for Context-Free Languages , 1965 .
[12] Patrick Pantel,et al. From Frequency to Meaning: Vector Space Models of Semantics , 2010, J. Artif. Intell. Res.
[13] Alex Graves,et al. Supervised Sequence Labelling , 2012 .
[14] O. Cappé,et al. On‐line expectation–maximization algorithm for latent data models , 2009 .
[15] Geoffrey E. Hinton,et al. A View of the EM Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.
[16] David J. Weir,et al. Characterizing Structural Descriptions Produced by Various Grammatical Formalisms , 1987, ACL.
[17] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[18] Ken-ichi Funahashi,et al. On the approximate realization of continuous mappings by neural networks , 1989, Neural Networks.
[19] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[20] Christoph Goller,et al. Learning task-dependent distributed representations by backpropagation through structure , 1996, Proceedings of International Conference on Neural Networks (ICNN'96).
[21] Ben O'Neill,et al. Exchangeability, Correlation, and Bayes' Effect , 2009 .
[22] Thomas L. Griffiths,et al. The nested Chinese restaurant process and Bayesian nonparametric inference of topic hierarchies , 2007, JACM.
[23] Thomas L. Griffiths,et al. Infinite latent feature models and the Indian buffet process , 2005, NIPS.
[24] Michael I. Jordan,et al. Hierarchical Dirichlet Processes , 2006 .
[25] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res.
[26] Matt Post,et al. Bayesian Learning of a Tree Substitution Grammar , 2009, ACL.
[27] Noah A. Smith,et al. Compiling Comp Ling: Weighted Dynamic Programming and the Dyna Language , 2005, HLT.
[28] M. Escobar,et al. Bayesian Density Estimation and Inference Using Mixtures , 1995 .
[29] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res.
[30] Yee Whye Teh,et al. Beam sampling for the infinite hidden Markov model , 2008, ICML '08.
[31] Jay Earley,et al. An efficient context-free parsing algorithm , 1970, Commun. ACM.
[32] M. Steedman,et al. Combinatory Categorial Grammar , 2011 .
[33] Thomas Hofmann,et al. Gaussian process classification for segmenting and annotating sequences , 2004, ICML.
[34] Regina Barzilay,et al. Bayesian Unsupervised Topic Segmentation , 2008, EMNLP.
[35] Joshua Goodman,et al. Parsing Algorithms and Metrics , 1996, ACL.
[36] Christopher D. Manning,et al. Hierarchical Bayesian Domain Adaptation , 2009, NAACL.
[37] R. Rosenfeld,et al. Two decades of statistical language modeling: where do we go from here? , 2000, Proceedings of the IEEE.
[38] Regina Barzilay,et al. Adding More Languages Improves Unsupervised Multilingual Part-of-Speech Tagging: a Bayesian Non-Parametric Approach , 2009, NAACL.
[39] Jun'ichi Tsujii,et al. Probabilistic CFG with Latent Annotations , 2005, ACL.
[40] Dan Klein,et al. Learning Accurate, Compact, and Interpretable Tree Annotation , 2006, ACL.
[41] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res.
[42] Shin Ishii,et al. On-line EM Algorithm for the Normalized Gaussian Network , 2000, Neural Computation.
[43] Yonatan Bisk,et al. An HDP Model for Inducing Combinatory Categorial Grammars , 2013, TACL.
[44] Dan Roth,et al. Integer linear programming inference for conditional random fields , 2005, ICML.
[45] Noah A. Smith,et al. Shared Logistic Normal Distributions for Soft Parameter Tying in Unsupervised Grammar Induction , 2009, NAACL.
[46] S. Fienberg. Bayesian Models and Methods in Public Policy and Government Settings , 2011, 1108.2177.
[47] Lukás Burget,et al. Extensions of recurrent neural network language model , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[48] Charles Kemp,et al. How to Grow a Mind: Statistics, Structure, and Abstraction , 2011, Science.
[49] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[50] David R. Karger,et al. Content Modeling Using Latent Permutations , 2009, J. Artif. Intell. Res.
[51] Geoffrey Zweig,et al. Context dependent recurrent neural network language model , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[52] Detlef Prescher,et al. Head-Driven PCFGs with Latent-Head Statistics , 2005, IWPT.
[53] Noah A. Smith,et al. Parsing with Soft and Hard Constraints on Dependency Length , 2005 .
[54] Shay B. Cohen,et al. Online Adaptor Grammars with Hybrid Inference , 2014, Transactions of the Association for Computational Linguistics.
[55] John DeNero,et al. Sampling Alignment Structure under a Bayesian Translation Model , 2008, EMNLP.
[56] Charles Kemp,et al. Bayesian models of cognition , 2008 .
[57] Fernando Pereira,et al. Relating Probabilistic Grammars and Automata , 1999, ACL.
[58] B. de Finetti,et al. Foresight: Its Logical Laws, Its Subjective Sources , 1992 .
[59] Markus Dreyer,et al. Better Informed Training of Latent Syntactic Features , 2006, EMNLP.
[60] Dan Klein,et al. Online EM for Unsupervised Models , 2009, NAACL.
[61] Hanna M. Wallach,et al. Topic modeling: beyond bag-of-words , 2006, ICML.
[62] Jason Eisner,et al. Transformational Priors Over Grammars , 2002, EMNLP.
[63] Chris Dyer,et al. A Gibbs Sampler for Phrasal Synchronous Grammar Induction , 2009, ACL.
[64] Regina Barzilay,et al. Unsupervised Multilingual Grammar Induction , 2009, ACL.
[65] N. Metropolis,et al. Equation of State Calculations by Fast Computing Machines , 1953, Resonance.
[66] Mark Johnson,et al. Exploring the Role of Stress in Bayesian Word Segmentation using Adaptor Grammars , 2014, TACL.
[67] Ralph Grishman,et al. A Procedure for Quantitatively Comparing the Syntactic Coverage of English Grammars , 1991, HLT.
[68] James Henderson,et al. Inducing History Representations for Broad Coverage Statistical Parsing , 2003, NAACL.
[69] Andreas Stolcke,et al. Inducing Probabilistic Grammars by Bayesian Model Merging , 1994, ICGI.
[70] Daniel H. Younger,et al. Recognition and Parsing of Context-Free Languages in Time n^3 , 1967, Inf. Control.
[71] Yee Whye Teh,et al. A Hierarchical Bayesian Language Model Based On Pitman-Yor Processes , 2006, ACL.
[72] Yoshua Bengio,et al. Neural Probabilistic Language Models , 2006 .
[73] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[74] Hermann Ney,et al. A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.
[75] Shankar Kumar,et al. Minimum Bayes-Risk Decoding for Statistical Machine Translation , 2004, NAACL.
[76] Shankar Kumar,et al. Lattice Minimum Bayes-Risk Decoding for Statistical Machine Translation , 2008, EMNLP.
[77] Thomas L. Griffiths,et al. Contextual Dependencies in Unsupervised Word Segmentation , 2006, ACL.
[78] Thomas L. Griffiths,et al. Probabilistic Topic Models , 2007 .
[79] J. Tenenbaum,et al. A tutorial introduction to Bayesian models of cognitive development , 2011, Cognition.
[80] J. Tenenbaum,et al. Probabilistic models of cognition: exploring representations and inductive biases , 2010, Trends in Cognitive Sciences.
[81] Paul J. Werbos,et al. Backpropagation Through Time: What It Does and How to Do It , 1990, Proc. IEEE.
[82] Regina Barzilay,et al. Unsupervised Multilingual Learning for POS Tagging , 2008, EMNLP.
[83] Matt Post,et al. Bayesian Tree Substitution Grammars as a Usage-based Approach , 2013, Language and speech.
[84] F. Rosenblatt,et al. The perceptron: a probabilistic model for information storage and organization in the brain , 1958, Psychological review.
[85] Hermann Ney,et al. Improved backing-off for M-gram language modeling , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[86] Mikio Yamamoto,et al. Dirichlet mixtures in text modeling , 2005 .
[87] Dan Klein,et al. Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency , 2004, ACL.
[88] Laura Kallmeyer,et al. Data-Driven Parsing with Probabilistic Linear Context-Free Rewriting Systems , 2010, COLING.
[89] Jeffrey L. Elman,et al. Distributed Representations, Simple Recurrent Networks, and Grammatical Structure , 1991, Mach. Learn.
[90] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[91] John Darlington,et al. A Transformation System for Developing Recursive Programs , 1977, J. ACM.
[92] Catherine L. Harris,et al. Connectionism and Cognitive Linguistics , 1990 .
[93] Yee Whye Teh,et al. A stochastic memoizer for sequence data , 2009, ICML '09.
[94] Donald Geman,et al. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[95] Zellig S. Harris,et al. Distributional Structure , 1954 .
[96] R. T. Cox. Probability, frequency and reasonable expectation , 1990 .
[97] Jordan B. Pollack,et al. Recursive Distributed Representations , 1990, Artif. Intell.
[98] Slava M. Katz,et al. Estimation of probabilities from sparse data for the language model component of a speech recognizer , 1987, IEEE Trans. Acoust. Speech Signal Process.
[99] Jianfeng Gao,et al. A comparison of Bayesian estimators for unsupervised Hidden Markov Model POS taggers , 2008, EMNLP.
[100] Yee Whye Teh,et al. Consistency and Fluctuations For Stochastic Gradient Langevin Dynamics , 2014, J. Mach. Learn. Res.
[101] Aravind K. Joshi,et al. Tree-Adjoining Grammars , 1997, Handbook of Formal Languages.
[102] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1992, Math. Control. Signals Syst.
[103] Mark Johnson,et al. Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars , 2009, NAACL.
[104] David J. C. MacKay,et al. A Practical Bayesian Framework for Backpropagation Networks , 1992, Neural Computation.
[105] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[106] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[107] Mikel L. Forcada,et al. Asynchronous translations with recurrent neural nets , 1997, Proceedings of International Conference on Neural Networks (ICNN'97).
[108] Graeme Hirst,et al. Bayesian Analysis in Natural Language Processing , 2016, Computational Linguistics.
[109] Michael I. Jordan,et al. Variational methods for the Dirichlet process , 2004, ICML.