Pattern Recognition and Neural Networks

From the Publisher: Pattern recognition has long been studied in relation to many different (and mainly unrelated) applications, such as remote sensing, computer vision, space research, and medical imaging. In this book Professor Ripley brings together two crucial ideas in pattern recognition: statistical methods and machine learning via neural networks. Unifying principles are brought to the fore, and the author gives an overview of the state of the subject. Many examples are included to illustrate real problems in pattern recognition and how to overcome them. This is a self-contained account, ideal both as an introduction for non-specialist readers and as a handbook for the more expert reader.
