Determining Hall of Fame Status for Major League Baseball Using an Artificial Neural Network

Election to Major League Baseball's (MLB) National Hall of Fame (HOF) often sparks debate among fans, media, players, managers, and other members of the baseball community. Because HOF members must be elected by a committee of baseball sportswriters and other entities, predicting a player's inclusion in the HOF is not trivial to model. Little research has addressed predicting HOF status from a player's career statistics. Many of the models found in a literature search are linear, and linear models do not provide robust solutions for classification on complex non-linear datasets. The multitude of possible combinations of career statistics is better suited to a non-linear model, such as an artificial neural network (ANN). The objective of this research is to create an ANN model that can predict HOF status for MLB players based on their career offensive and defensive statistics as well as their number of career end-of-season awards. This research is limited to players who are not pitchers. Another objective of this report is to give the audience of this particular journal an overview of ANNs.
