Pattern Recognition and Machine Learning
[1] Karl Pearson. On lines and planes of closest fit to systems of points in space , 1901 .
[2] L. M. M.-T.. Theory of Probability , 1929, Nature.
[3] H. Hotelling. Analysis of a complex of statistical variables into principal components. , 1933 .
[4] H. Hotelling. Relations Between Two Sets of Variates , 1936 .
[5] H. Jeffreys. An invariant form for the prior probability in estimation problems , 1946, Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences.
[6] L. Stein,et al. Probability and the Weighing of Evidence , 1950 .
[7] N. Metropolis,et al. Equation of State Calculations by Fast Computing Machines , 1953, Resonance.
[8] J. Blum. Multidimensional Stochastic Approximation Methods , 1954 .
[9] Richard Bellman,et al. Adaptive Control Processes: A Guided Tour , 1961, The Mathematical Gazette.
[10] Robert G. Gallager,et al. Low-density parity-check codes , 1962, IRE Trans. Inf. Theory.
[11] H. D. Block. The perceptron: a model for brain functioning. I , 1962 .
[12] Frank Rosenblatt,et al. Principles of Neurodynamics. Perceptrons and the Theory of Brain Mechanisms , 1963 .
[13] T. W. Anderson. Asymptotic Theory for Principal Component Analysis , 1963 .
[14] E. Nadaraya. On Estimating Regression , 1964 .
[15] G. S. Watson,et al. Smooth regression analysis , 1964 .
[16] S. M. Ali,et al. A General Class of Coefficients of Divergence of One Distribution from Another , 1966 .
[17] I. Miller. Probability, Random Variables, and Stochastic Processes , 1966 .
[18] Peter E. Hart,et al. Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.
[19] Andrew J. Viterbi,et al. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm , 1967, IEEE Trans. Inf. Theory.
[20] J. MacQueen. Some methods for classification and analysis of multivariate observations , 1967 .
[21] A. M. Walker. On the Asymptotic Behaviour of Posterior Distributions , 1969 .
[22] W. K. Hastings,et al. Monte Carlo Sampling Methods Using Markov Chains and Their Applications , 1970 .
[23] L. Baum,et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process , 1972 .
[24] Alan J. Mayne,et al. Generalized Inverse of Matrices and its Applications , 1972 .
[25] R. Mazo. On the theory of brownian motion , 1973 .
[26] G. C. Tiao,et al. Bayesian inference in statistical analysis , 1973 .
[27] Richard O. Duda,et al. Pattern classification and scene analysis , 1974, A Wiley-Interscience publication.
[28] H. Akaike. A new look at the statistical model identification , 1974 .
[29] J. Besag. On Spatial-Temporal Models and Markov Fields , 1977 .
[30] 丸山 徹. On a few developments in convex analysis , 1977 .
[31] A. N. Tikhonov,et al. Solutions of ill-posed problems , 1977 .
[32] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[33] G. Schwarz. Estimating the Dimension of a Model , 1978 .
[34] A. Dawid. Conditional Independence for Statistical Operations , 1980 .
[35] J. Laurie Snell,et al. Markov Random Fields and Their Applications , 1980 .
[36] F. Krauss. Latent Structure Analysis , 1980 .
[37] V.W.S. Chan,et al. Principles of Digital Communication and Coding , 1979 .
[38] S. Adler. Over-relaxation method for the Monte Carlo evaluation of the partition function for multiquadratic actions , 1981 .
[39] Philip E. Gill,et al. Practical optimization , 1981 .
[40] Dorothy T. Thayer,et al. EM algorithms for ML factor analysis , 1982 .
[41] S. P. Lloyd,et al. Least squares quantization in PCM , 1982, IEEE Trans. Inf. Theory.
[42] David Lindley. Scoring rules and the inevitability of probability , 1982 .
[43] Leslie G. Valiant,et al. A theory of the learnable , 1984, STOC '84.
[44] Brian Everitt,et al. An Introduction to Latent Variable Models , 1984 .
[45] Josef Kittler,et al. Contextual classification of multispectral pixel data , 1984, Image Vis. Comput..
[46] Donald Geman,et al. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images , 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[47] Shun-ichi Amari,et al. Differential-geometrical methods in statistics , 1985 .
[48] G. Wahba. A Comparison of GCV and GML for Choosing the Smoothing Parameter in the Generalized Spline Smoothing Problem , 1985 .
[49] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[50] Jonathan D. Cryer,et al. Time Series Analysis , 1986, Encyclopedia of Big Data.
[51] J. Besag. On the Statistical Analysis of Dirty Pictures , 1986 .
[52] R. Hathaway. Another interpretation of the EM algorithm for mixture distributions , 1986 .
[53] Kin Hong Wong,et al. Script recognition using hidden Markov models , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[54] L. Sirovich. Turbulence and the dynamics of coherent structures. II. Symmetries and transformations , 1987 .
[55] S. Duane,et al. Hybrid Monte Carlo , 1987 .
[56] H. Reinhardt. Statistical Decision Theory and Bayesian Analysis. Second Edition (James O. Berger) , 1987 .
[57] M. J. D. Powell,et al. Radial basis functions for multivariable interpolation: a review , 1987 .
[58] Geoffrey J. McLachlan,et al. Mixture models : inference and applications to clustering , 1989 .
[59] David J. Spiegelhalter,et al. Local computations with probabilities on graphical structures and their application to expert systems , 1990 .
[60] James A. Anderson,et al. Neurocomputing: Foundations of Research , 1988 .
[61] M. Hodgson. Reducing the computational requirements of the minimum-distance classifier , 1988 .
[62] Judea Pearl,et al. Probabilistic reasoning in intelligent systems , 1988 .
[63] S. Ragazzini,et al. Learning of word stress in a sub-optimal second order back-propagation neural network , 1988, IEEE 1988 International Conference on Neural Networks.
[64] David S. Broomhead,et al. Multivariable Functional Interpolation and Adaptive Networks , 1988, Complex Syst..
[65] John Moody,et al. Fast Learning in Networks of Locally-Tuned Processing Units , 1989, Neural Computation.
[66] Ken-ichi Funahashi,et al. On the approximate realization of continuous mappings by neural networks , 1989, Neural Networks.
[67] Yann LeCun,et al. Improving the convergence of back-propagation learning with second-order methods , 1989 .
[68] Ross D. Shachter,et al. Simulation Approaches to General Probabilistic Inference on Belief Networks , 2013, UAI.
[69] Stephen F. Gull,et al. Developments in Maximum Entropy Data Analysis , 1989 .
[70] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst..
[71] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[72] Kurt Hornik,et al. Neural networks and principal component analysis: Learning from examples without local minima , 1989, Neural Networks.
[73] Kuo-Chu Chang,et al. Weighing and Integrating Evidence for Stochastic Simulation in Bayesian Networks , 2013, UAI.
[74] Biing-Hwang Juang,et al. On the application of hidden Markov models for enhancing noisy speech , 1989, IEEE Trans. Acoust. Speech Signal Process..
[75] D. Greig,et al. Exact Maximum A Posteriori Estimation for Binary Images , 1989 .
[76] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[77] V. Ramasubramanian,et al. A generalized optimization of the K-d tree for fast nearest-neighbour search , 1989, Fourth IEEE Region 10 International Conference TENCON.
[78] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.
[79] H. White,et al. Universal approximation using feedforward networks with non-sigmoid hidden layer activation functions , 1989, International 1989 Joint Conference on Neural Networks.
[80] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[81] F. Girosi,et al. Networks for approximation and learning , 1990, Proc. IEEE.
[82] W. S. McCulloch,et al. A logical calculus of the ideas immanent in nervous activity , 1990, The Philosophy of Artificial Intelligence.
[83] Neil E. Cotter,et al. The Stone-Weierstrass theorem and its application to neural networks , 1990, IEEE Trans. Neural Networks.
[84] R. T. Cox. Probability, frequency and reasonable expectation , 1990 .
[85] Bernard Widrow,et al. 30 years of adaptive neural networks: perceptron, Madaline, and backpropagation , 1990, Proc. IEEE.
[86] Kohji Fukunaga,et al. Introduction to Statistical Pattern Recognition-Second Edition , 1990 .
[87] David J. Spiegelhalter,et al. Sequential updating of conditional probabilities on directed graphical structures , 1990, Networks.
[88] M. Frydenberg. The chain graph Markov property , 1990 .
[89] Shang-Liang Chen,et al. Orthogonal least squares learning algorithm for radial basis function networks , 1991, IEEE Trans. Neural Networks.
[90] Vladik Kreinovich,et al. Arbitrary nonlinearity is sufficient to represent all functions by neural networks: A theorem , 1991, Neural Networks.
[91] Christian Jutten,et al. Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture , 1991, Signal Process..
[92] Christopher M. Bishop,et al. A Fast Procedure for Retraining the Multilayer Perceptron , 1991, Int. J. Neural Syst..
[93] Yoshifusa Ito,et al. Representation of functions by superpositions of a step or sigmoid function and their applications to neural network theory , 1991, Neural Networks.
[94] Marlin A. Koschat,et al. Maximum Entropy Methods in Science and Engineering (Vol. 2) , 1991 .
[95] Anders Krogh,et al. Introduction to the theory of neural computation , 1994, The advanced book program.
[96] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[97] Geoffrey E. Hinton,et al. Adaptive Mixtures of Local Experts , 1991, Neural Computation.
[98] Kurt Hornik,et al. Approximation capabilities of multilayer feedforward networks , 1991, Neural Networks.
[99] Jocelyn Sietsma,et al. Creating artificial neural networks that generalize , 1991, Neural Networks.
[100] J. Magnus,et al. Matrix Differential Calculus with Applications in Statistics and Econometrics , 1991 .
[101] Yann LeCun,et al. Tangent Prop - A Formalism for Specifying Selected Invariances in an Adaptive Network , 1991, NIPS.
[102] Bernhard E. Boser,et al. A training algorithm for optimal margin classifiers , 1992, COLT '92.
[104] O. Mangasarian,et al. Robust linear programming discrimination of two linearly inseparable sets , 1992 .
[105] David J. C. MacKay,et al. The Evidence Framework Applied to Classification Networks , 1992, Neural Computation.
[106] Yann LeCun,et al. Efficient Pattern Recognition Using a New Transformation Distance , 1992, NIPS.
[107] David J. C. MacKay,et al. Bayesian Interpolation , 1992, Neural Computation.
[108] Babak Hassibi,et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon , 1992, NIPS.
[109] S. Lauritzen. Propagation of Probabilities, Means, and Variances in Mixed Graphical Association Models , 1992 .
[110] David J. C. MacKay,et al. A Practical Bayesian Framework for Backpropagation Networks , 1992, Neural Computation.
[111] Geoffrey E. Hinton,et al. Simplifying Neural Networks by Soft Weight-Sharing , 1992, Neural Computation.
[112] Geoffrey E. Hinton,et al. Keeping the neural networks simple by minimizing the description length of the weights , 1993, COLT '93.
[113] A. Glavieux,et al. Near Shannon limit error-correcting coding and decoding: Turbo-codes. 1 , 1993, Proceedings of ICC '93 - IEEE International Conference on Communications.
[114] Christopher M. Bishop,et al. Curvature-driven smoothing: a learning algorithm for feedforward networks , 1993, IEEE Trans. Neural Networks.
[115] Radford M. Neal. A new view of the EM algorithm that justifies incremental and other variants , 1993 .
[116] N. Gordon,et al. Novel approach to nonlinear/non-Gaussian Bayesian state estimation , 1993 .
[117] Michael I. Jordan,et al. Supervised learning from incomplete data via an EM approach , 1993, NIPS.
[118] Robert Haining,et al. Statistics for spatial data: by Noel Cressie, 1991, John Wiley & Sons, New York, 900 p., ISBN 0-471-84336-9, US $89.95 , 1993 .
[119] Xiao-Li Meng,et al. Maximum likelihood estimation via the ECM algorithm: A general framework , 1993 .
[120] Joab R Winkler,et al. Numerical recipes in C: The art of scientific computing, second edition , 1993 .
[121] Hans L. Bodlaender,et al. A Tourist Guide through Treewidth , 1993, Acta Cybern..
[122] Robert Hecht-Nielsen,et al. On the Geometry of Feedforward Neural Network Error Surfaces , 1993, Neural Computation.
[123] C. Bishop,et al. Analysis of multiphase flows using dual-energy gamma densitometry and neural networks , 1993 .
[124] Heekuck Oh,et al. Neural Networks for Pattern Recognition , 1993, Adv. Comput..
[125] Biing-Hwang Juang,et al. Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.
[126] Robert A. Jacobs,et al. Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.
[127] Yoshua Bengio,et al. An Input Output HMM Architecture , 1994, NIPS.
[128] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[129] Christopher M. Bishop,et al. Novelty detection and neural network validation , 1994 .
[130] Andrew R. Webb,et al. Functional approximation by feed-forward networks: a least-squares approach to generalization , 1994, IEEE Trans. Neural Networks.
[131] Wray L. Buntine,et al. Computing second derivatives in feed-forward networks: a review , 1994, IEEE Trans. Neural Networks.
[132] John B. Moore,et al. Hidden Markov Models: Estimation and Control , 1994 .
[133] Umesh V. Vazirani,et al. An Introduction to Computational Learning Theory , 1994 .
[134] W. Gibbs,et al. Finite element methods , 2017, Graduate Studies in Mathematics.
[135] L. Tierney. Markov Chains for Exploring Posterior Distributions , 1994 .
[136] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian , 1994, Neural Computation.
[137] Uffe Kjærulff,et al. Blocking Gibbs sampling in very large probabilistic expert systems , 1995, Int. J. Hum. Comput. Stud..
[138] Stuart J. Russell,et al. Stochastic simulation algorithms for dynamic probabilistic networks , 1995, UAI.
[139] Terrence J. Sejnowski,et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution , 1995, Neural Computation.
[140] Persi Diaconis,et al. What do we know about the Metropolis algorithm? , 1995, STOC '95.
[141] J. Besag,et al. Bayesian Computation and Stochastic Systems , 1995 .
[142] Todd K. Leen,et al. From Data Distributions to Regularization in Invariant Learning , 1995, Neural Computation.
[143] J. E. Jackson. Statistical Factor Analysis and Related Methods: Theory and Applications , 1995 .
[144] Yoshua Bengio,et al. Pattern Recognition and Neural Networks , 1995 .
[145] Walter R. Gilks,et al. Adaptive rejection metropolis sampling , 1995 .
[146] Christopher M. Bishop,et al. Modelling conditional probability distributions for periodic variables , 1995 .
[147] Christopher M. Bishop,et al. EM Optimization of Latent-Variables Density Models , 1995, NIPS.
[148] Christopher M. Bishop,et al. Training with Noise is Equivalent to Tikhonov Regularization , 1995, Neural Computation.
[149] Thomas G. Dietterich,et al. Solving Multiclass Learning Problems via Error-Correcting Output Codes , 1994, J. Artif. Intell. Res..
[150] Geoffrey E. Hinton,et al. Bayesian Learning for Neural Networks , 1995 .
[151] Michael Brady,et al. Novelty detection for the identification of masses in mammograms , 1995 .
[152] D. Mackay,et al. Bayesian neural networks and density networks , 1995 .
[153] Andrzej Cichocki,et al. A New Learning Algorithm for Blind Signal Separation , 1995, NIPS.
[154] Geoffrey E. Hinton,et al. The EM algorithm for mixtures of factor analyzers , 1996 .
[155] Barak A. Pearlmutter,et al. Maximum Likelihood Blind Source Separation: A Context-Sensitive Generalization of ICA , 1996, NIPS.
[156] Mark Jerrum,et al. The Markov chain Monte Carlo method: an approach to approximate counting and integration , 1996 .
[157] P. M. Williams,et al. Using Neural Networks to Model Conditional Multivariate Densities , 1996, Neural Computation.
[158] Geoffrey E. Hinton,et al. Parameter estimation for linear dynamical systems , 1996 .
[159] David Barber,et al. Bayesian Model Comparison by Monte Carlo Chaining , 1996, NIPS.
[160] Christopher M. Bishop,et al. GTM: A Principled Alternative to the Self-Organizing Map , 1996, NIPS.
[161] Yoav Freund,et al. Experiments with a New Boosting Algorithm , 1996, ICML.
[162] R. Tibshirani. Regression Shrinkage and Selection via the Lasso , 1996 .
[163] H. Luetkepohl. The Handbook of Matrices , 1996 .
[164] Frederick Jelinek,et al. Statistical methods for speech recognition , 1997 .
[165] David Barber,et al. Ensemble Learning for Multi-Layer Networks , 1997, NIPS.
[166] Paul W. Goldberg,et al. Regression with Input-dependent Noise: A Gaussian Process Treatment , 1997, NIPS.
[167] Geoffrey E. Hinton,et al. Evaluation of Gaussian processes and other methods for non-linear regression , 1997 .
[168] Geoffrey E. Hinton,et al. Modeling the manifolds of images of handwritten digits , 1997, IEEE Trans. Neural Networks.
[169] Federico Girosi,et al. Support Vector Machines: Training and Applications , 1997 .
[170] D. Chakrabarti,et al. A fast fixed-point algorithm for independent component analysis , 1997 .
[171] Christopher K. I. Williams,et al. An upper bound on the Bayesian error bars for generalized linear regression , 1997 .
[172] Enrique F. Castillo,et al. Expert Systems and Probabilistic Network Models , 1996, Monographs in Computer Science.
[173] Huaiyu Zhu. On Information and Sufficiency , 1997 .
[174] Radford M. Neal. Monte Carlo Implementation of Gaussian Process Models for Bayesian Regression and Classification , 1997, physics/9701026.
[175] Brendan J. Frey,et al. A Revolution: Belief Propagation in Graphs with Cycles , 1997, NIPS.
[176] Christopher K. I. Williams,et al. Magnification factors for the GTM algorithm , 1997 .
[177] Sam T. Roweis,et al. EM Algorithms for PCA and SPCA , 1997, NIPS.
[178] Nanda Kambhatla,et al. Dimension Reduction by Local Principal Component Analysis , 1997, Neural Computation.
[179] Tomaso A. Poggio,et al. Example-Based Learning for View-Based Human Face Detection , 1998, IEEE Trans. Pattern Anal. Mach. Intell..
[180] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[181] Christopher M. Bishop,et al. GTM: The Generative Topographic Mapping , 1998, Neural Computation.
[182] Jung-Fu Cheng,et al. Turbo Decoding as an Instance of Pearl's "Belief Propagation" Algorithm , 1998, IEEE J. Sel. Areas Commun..
[183] Christopher K. I. Williams. Prediction with Gaussian Processes: From Linear Regression to Linear Prediction and Beyond , 1999, Learning in Graphical Models.
[184] Vladimir Vapnik,et al. Statistical learning theory , 1998 .
[185] Bernhard Schölkopf,et al. Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.
[186] Michael E. Tipping. Probabilistic Visualisation of High-Dimensional Binary Data , 1998, NIPS.
[187] Radford M. Neal,et al. Suppressing Random Walks in Markov Chain Monte Carlo Using Ordered Overrelaxation , 1995, Learning in Graphical Models.
[188] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[189] T. Ens,et al. Blind signal separation: statistical principles , 1998 .
[190] Brendan J. Frey,et al. Graphical Models for Machine Learning and Digital Communication , 1998 .
[191] Christopher J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition , 1998 .
[192] Christopher M. Bishop,et al. Ensemble learning in Bayesian neural networks , 1998 .
[193] Christopher M. Bishop,et al. Developments of the generative topographic mapping , 1998, Neurocomputing.
[194] Jason Weston,et al. Multi-Class Support Vector Machines , 1998 .
[195] Xavier Boyen,et al. Tractable Inference for Complex Stochastic Processes , 1998, UAI.
[196] Sadik Kapadia,et al. Discriminative Training of Hidden Markov Models , 1998 .
[197] Charles E. McCulloch,et al. The EM Algorithm and Its Extensions , 1998 .
[198] Christopher K. I. Williams. Computation with Infinite Neural Networks , 1998, Neural Computation.
[199] Christopher M. Bishop,et al. Bayesian PCA , 1998, NIPS.
[200] Stephen P. Brooks,et al. Markov chain Monte Carlo method and its application , 1998 .
[201] S. Mallat. A wavelet tour of signal processing , 1998 .
[202] Alexander J. Smola,et al. Learning with kernels , 1998 .
[203] Christopher M. Bishop,et al. A Hierarchical Latent Variable Model for Data Visualization , 1998, IEEE Trans. Pattern Anal. Mach. Intell..
[204] Michael I. Jordan. Graphical Models , 1998 .
[205] David Haussler,et al. Exploiting Generative Models in Discriminative Classifiers , 1998, NIPS.
[206] David Barber,et al. Bayesian Classification With Gaussian Processes , 1998, IEEE Trans. Pattern Anal. Mach. Intell..
[207] David J. C. MacKay,et al. Good Error-Correcting Codes Based on Very Sparse Matrices , 1997, IEEE Trans. Inf. Theory.
[208] Volker Roth,et al. Nonlinear Discriminant Analysis Using Kernel Functions , 1999, NIPS.
[209] Hagai Attias,et al. Independent Factor Analysis , 1999, Neural Computation.
[210] Christopher M. Bishop,et al. Mixtures of Probabilistic Principal Component Analyzers , 1999, Neural Computation.
[211] Zoubin Ghahramani,et al. A Unifying Review of Linear Gaussian Models , 1999, Neural Computation.
[212] Olga Veksler,et al. Fast approximate energy minimization via graph cuts , 2001, Proceedings of the Seventh IEEE International Conference on Computer Vision.
[213] Nello Cristianini,et al. Large Margin DAGs for Multiclass Classification , 1999, NIPS.
[214] B. Schölkopf,et al. Fisher discriminant analysis with kernels , 1999, Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop.
[215] Nello Cristianini,et al. Controlling the Sensitivity of Support Vector Machines , 1999 .
[216] John C. Platt,et al. Fast training of support vector machines using sequential minimal optimization, advances in kernel methods , 1999 .
[217] Thomas Hofmann,et al. Learning the Similarity of Documents: An Information-Geometric Approach to Document Retrieval and Categorization , 1999, NIPS.
[218] David Haussler,et al. Convolution kernels on discrete structures , 1999 .
[219] Manfred Opper,et al. A Bayesian approach to on-line learning , 1999 .
[220] J. March. Introduction to the Calculus of Variations , 1999 .
[221] Hagai Attias,et al. Inferring Parameters and Structure of Latent Variable Models by Variational Bayes , 1999, UAI.
[222] Eric W. Weisstein,et al. The CRC concise encyclopedia of mathematics , 1999 .
[223] Christopher M. Bishop. Variational principal components , 1999 .
[224] Purushottam W. Laud,et al. Bayesian Nonparametric Inference for Random Distributions and Related Functions , 1999 .
[225] David J. Spiegelhalter,et al. Probabilistic Networks and Expert Systems , 1999, Information Science and Statistics.
[226] David J. C. MacKay,et al. Comparison of Approximate Methods for Handling Hyperparameters , 1999, Neural Computation.
[227] Zoubin Ghahramani,et al. Variational Inference for Bayesian Mixtures of Factor Analysers , 1999, NIPS.
[228] Michael E. Tipping,et al. Probabilistic Principal Component Analysis , 1999 .
[229] Daphne Koller,et al. Restricted Bayes Optimal Classifiers , 2000, AAAI/IAAI.
[230] Arthur E. Hoerl,et al. Ridge Regression: Biased Estimation for Nonorthogonal Problems , 2000, Technometrics.
[231] Hoon Kim,et al. Monte Carlo Statistical Methods , 2000, Technometrics.
[232] Tom Minka,et al. Automatic Choice of Dimensionality for PCA , 2000, NIPS.
[233] Nello Cristianini,et al. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .
[234] Geoffrey E. Hinton,et al. Variational Learning for Switching State-Space Models , 2000, Neural Computation.
[235] David J. C. MacKay,et al. Variational Gaussian process classifiers , 2000, IEEE Trans. Neural Networks Learn. Syst..
[236] J. Friedman. Additive logistic regression: A statistical view of boosting , 2000 .
[237] John A. Bather,et al. Decision Theory: An Introduction to Dynamic Programming and Sequential Decisions , 2000 .
[238] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[239] Bernhard Schölkopf,et al. New Support Vector Algorithms , 2000, Neural Computation.
[240] Alexander J. Smola,et al. Sparse Greedy Gaussian Process Regression , 2000, NIPS.
[241] G. Baudat,et al. Generalized Discriminant Analysis Using a Kernel Approach , 2000, Neural Computation.
[242] Tommi S. Jaakkola,et al. Tutorial on variational approximation methods , 2000 .
[243] Christopher K. I. Williams,et al. Using the Nyström Method to Speed Up Kernel Machines , 2000, NIPS.
[244] P. Bartlett,et al. Probabilities for SV Machines , 2000 .
[245] David J. C. MacKay,et al. Density networks , 2000 .
[246] Yoram Singer,et al. Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers , 2000, J. Mach. Learn. Res..
[247] Michael I. Jordan,et al. Bayesian parameter estimation via variational methods , 2000, Stat. Comput..
[248] Christopher M. Bishop,et al. Non-linear Bayesian Image Modelling , 2000, ECCV.
[249] Brendan J. Frey,et al. Factor graphs and the sum-product algorithm , 2001, IEEE Trans. Inf. Theory.
[250] Tom Minka,et al. Expectation Propagation for approximate Bayesian inference , 2001, UAI.
[251] J. Friedman. Greedy function approximation: A gradient boosting machine. , 2001 .
[252] Sujit K. Ghosh,et al. Essential Wavelets for Statistical Applications and Data Analysis , 2001, Technometrics.
[253] Sanjoy Dasgupta,et al. A Generalization of Principal Components Analysis to the Exponential Family , 2001, NIPS.
[254] Adrian Corduneanu,et al. Variational Bayesian Model Selection for Mixture Distributions , 2001 .
[255] Paul Zarchan,et al. Fundamentals of Kalman Filtering: A Practical Approach , 2001 .
[256] Antony I. T. Rowstron,et al. Optimising Synchronisation Times for Mobile Devices , 2001, NIPS.
[257] Bernhard Schölkopf,et al. Estimating the Support of a High-Dimensional Distribution , 2001, Neural Computation.
[258] D K Smith,et al. Numerical Optimization , 2001, J. Oper. Res. Soc..
[259] Yee Whye Teh,et al. A New View of ICA , 2001 .
[260] T. Başar,et al. A New Approach to Linear Filtering and Prediction Problems , 2001 .
[261] George Eastman House,et al. Sparse Bayesian Learning and the Relevance Vector Machine , 2001 .
[262] Stan Lipovetsky,et al. Latent Variable Models and Factor Analysis , 2001, Technometrics.
[263] Peter Tiño,et al. Using Directional Curvatures to Visualize Folding Patterns of the GTM Projection Manifolds , 2001, ICANN.
[264] W. Michael Conklin,et al. Monte Carlo Methods in Bayesian Computation , 2001, Technometrics.
[265] Michael E. Tipping,et al. Analysis of Sparse Bayesian Learning , 2001, NIPS.
[266] Hinrich Schütze,et al. Book Reviews: Foundations of Statistical Natural Language Processing , 1999, CL.
[267] Ian T. Nabney,et al. Netlab: Algorithms for Pattern Recognition , 2002 .
[268] Eric R. Ziegel,et al. Generalized Linear Models , 2002, Technometrics.
[269] B. Rannala. Bioinformatics: The Machine Learning Approach. Second Edition. Adaptive Computation and Machine Learning. By Pierre Baldi and Søren Brunak. A Bradford Book. Cambridge (Massachusetts): MIT Press. $49.95. xxiii + 452 p; ill.; index. ISBN: 0-262-02506-X. 2001. , 2002 .
[270] David J. Spiegelhalter,et al. VIBES: A Variational Inference Engine for Bayesian Networks , 2002, NIPS.
[271] Tim Hesterberg,et al. Monte Carlo Strategies in Scientific Computing , 2002, Technometrics.
[272] Ole Winther,et al. Mean-Field Approaches to Independent Component Analysis , 2002, Neural Computation.
[273] Lehel Csató,et al. Sparse On-Line Gaussian Processes , 2002, Neural Computation.
[274] Jean Ponce,et al. Computer Vision: A Modern Approach , 2002 .
[275] Peter Tiño,et al. Hierarchical GTM: Constructing Localized Nonlinear Projection Manifolds in a Principled Way , 2002, IEEE Trans. Pattern Anal. Mach. Intell..
[276] Tom Heskes,et al. Fractional Belief Propagation , 2002, NIPS.
[277] Christopher M. Bishop,et al. Bayesian Hierarchical Mixtures of Experts , 2002, UAI.
[278] Thore Graepel,et al. Solving Noisy Linear Operator Equations by Gaussian Processes: Application to Ordinary and Partial Differential Equations , 2003, ICML.
[279] Michael I. Jordan,et al. Hierarchical Bayesian Models for Applications in Information Retrieval , 2003 .
[280] Jong-Hoon Ahn,et al. A Constrained EM Algorithm for Principal Component Analysis , 2003, Neural Computation.
[281] Patrice Y. Simard,et al. Best practices for convolutional neural networks applied to visual document analysis , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..
[282] Matthias W. Seeger,et al. Bayesian Gaussian process models : PAC-Bayesian generalisation error bounds and sparse approximations , 2003 .
[283] Radford M. Neal. Slice Sampling , 2003, The Annals of Statistics.
[284] Bernhard Schölkopf,et al. Learning to Find Pre-Images , 2003, NIPS.
[285] Michael I. Jordan,et al. Kernel independent component analysis , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[286] Charles Elkan,et al. Using the Triangle Inequality to Accelerate k-Means , 2003, ICML.
[287] Nello Cristianini,et al. Kernel Methods for Pattern Analysis , 2003, ICTAI.
[288] T. Speed,et al. Biological Sequence Analysis , 1998 .
[289] Eric R. Ziegel,et al. The Elements of Statistical Learning , 2003, Technometrics.
[290] Michael E. Tipping,et al. Fast Marginal Likelihood Maximisation for Sparse Bayesian Models , 2003, AISTATS.
[291] Stephen J. Roberts,et al. Variational Mixture of Bayesian Independent Component Analyzers , 2003, Neural Computation.
[292] Terrence J. Sejnowski,et al. Variational Bayesian Learning of ICA with Missing Data , 2003, Neural Computation.
[293] Christopher K. I. Williams. Learning Kernel Classifiers , 2003 .
[294] David A. McAllester. PAC-Bayesian Stochastic Model Selection , 2003, Machine Learning.
[295] Fernando A. Quintana,et al. Nonparametric Bayesian data analysis , 2004 .
[296] James V. Stone. Independent Component Analysis: A Tutorial Introduction , 2007 .
[297] Vladimir Kolmogorov,et al. What energy functions can be minimized via graph cuts? , 2002, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[298] Refik Soyer,et al. Bayesian Methods for Nonlinear Classification and Regression , 2004, Technometrics.
[299] Teuvo Kohonen,et al. Self-organized formation of topologically correct feature maps , 2004, Biological Cybernetics.
[300] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[301] Nando de Freitas,et al. An Introduction to MCMC for Machine Learning , 2004, Machine Learning.
[302] Volker Tresp,et al. Scaling Kernel-Based Systems to Large Data Sets , 2001, Data Mining and Knowledge Discovery.
[303] Michael I. Jordan,et al. An Introduction to Variational Methods for Graphical Models , 1999, Machine Learning.
[304] Bo Thiesson,et al. ARMA Time-Series Modeling with Graphical Models , 2004, UAI.
[305] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 2001 .
[306] T. Minka. Power EP , 2004 .
[307] G. Wahba,et al. Multicategory Support Vector Machines, Theory, and Application to the Classification of Microarray Data and Satellite Radiance Data , 2003 .
[308] Michael Isard,et al. CONDENSATION—Conditional Density Propagation for Visual Tracking , 1998, International Journal of Computer Vision.
[309] Klaus Schulten,et al. Self-organizing maps: ordering, convergence properties and energy functions , 1992, Biological Cybernetics.
[310] Paul A. Viola,et al. Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.
[311] Larry Wasserman,et al. All of Statistics: A Concise Course in Statistical Inference , 2004 .
[312] J. Ross Quinlan,et al. Induction of Decision Trees , 1986, Machine Learning.
[313] Nir Friedman,et al. Being Bayesian About Network Structure. A Bayesian Approach to Structure Discovery in Bayesian Networks , 2004, Machine Learning.
[314] Christopher M. Bishop,et al. Robust Bayesian Mixture Modelling , 2005, ESANN.
[315] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[316] Christopher M. Bishop,et al. Distinguishing text from graphics in on-line handwritten ink , 2004, Ninth International Workshop on Frontiers in Handwriting Recognition.
[317] H. Bourlard,et al. Auto-association by multilayer perceptrons and singular value decomposition , 1988, Biological Cybernetics.
[318] David J. C. MacKay,et al. Information Theory, Inference, and Learning Algorithms , 2004, IEEE Transactions on Information Theory.
[319] M. Tribus,et al. Probability theory: the logic of science , 2003 .
[320] Stephen J. Roberts,et al. An Anthology of Probabilistic Models for Medical Informatics , 2005 .
[321] Andrew Blake,et al. Sparse Bayesian learning for efficient visual tracking , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[322] Carl E. Rasmussen,et al. Assessing Approximations for Gaussian Process Classification , 2005, NIPS.
[323] Christopher M. Bishop,et al. Variational Message Passing , 2005, J. Mach. Learn. Res..
[324] A. Rollett,et al. The Monte Carlo Method , 2004 .
[325] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[326] Thomas P. Minka,et al. Divergence measures and message passing , 2005 .
[327] Martin J. Wainwright,et al. A new class of upper bounds on the log partition function , 2002, IEEE Transactions on Information Theory.
[328] V. Vapnik. Estimation of Dependences Based on Empirical Data , 2006 .
[329] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[330] Tony Jebara,et al. Machine learning: Discriminative and generative , 2006 .
[331] Sang Joon Kim,et al. A Mathematical Theory of Communication , 2006 .
[332] Michael I. Jordan,et al. Hierarchical Dirichlet Processes , 2006 .
[333] Tom Minka,et al. Principled Hybrids of Generative and Discriminative Models , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
[334] Murray A. Jorgensen. Iteratively Reweighted Least Squares , 2006 .
[335] H. Robbins. A Stochastic Approximation Method , 1951 .
[336] M. V. Velzen,et al. Self-organizing maps , 2007 .
[337] T. Hastie,et al. Principal Curves , 2007 .
[338] David Hinkley,et al. Bootstrap Methods: Another Look at the Jackknife , 2008 .
[339] Lakhmi C. Jain,et al. Introduction to Bayesian Networks , 2008 .
[340] Sunita Sarawagi. Learning with Graphical Models , 2008 .
[341] P. Deb. Finite Mixture Models , 2008 .
[342] Iain Murray,et al. Introduction To Gaussian Processes , 2008 .
[343] S. E. Ahmed,et al. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference , 2008, Technometrics.
[344] Jan de Leeuw,et al. Journal of Statistical Software , 2009 .
[345] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[346] R. Shah,et al. Least Squares Support Vector Machines , 2022 .
[347] K. Schittkowski,et al. NONLINEAR PROGRAMMING , 2022 .