Conditions for Occam's Razor Applicability and Noise Elimination

Occam's razor holds that, among all correct hypotheses, the simplest one best captures the structure of the problem domain and achieves the highest prediction accuracy when classifying new instances. The principle is also used implicitly in noise handling, where rule truncation or decision-tree pruning keeps the learner from overfitting a noisy training set. This work presents a theoretical framework for the applicability of Occam's razor and develops it into a procedure for eliminating noise from a training set. Empirical evaluation shows the usefulness of the presented approach to noise elimination.
