Why Discretization Works for Naive Bayesian Classifiers

This paper explains why well-known discretization methods, such as entropy-based and ten-bin, work well for naive Bayesian classifiers with continuous variables, regardless of their complexity. These methods usually assume that discretized variables have Dirichlet priors. Since perfect aggregation holds for Dirichlets, we can show that, in general, a wide variety of discretization methods perform well, with insignificant differences among them. We identify situations where discretization may cause performance degradation and show that they are unlikely to arise for well-known methods. We empirically test our explanation on synthesized and real data sets and obtain confirming results. Our analysis leads to a lazy discretization method that can simplify training for naive Bayes. This new method performs as well as well-known methods in our experiments.
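
To make the setting concrete, the following is a minimal sketch, not taken from the paper, of the ten-bin (equal-width) discretization scheme mentioned above, feeding class-conditional bin frequencies for a naive Bayes estimate. The synthetic data, the helper name `ten_bin_discretize`, and the use of add-one smoothing (which corresponds to a symmetric Dirichlet prior, the assumption the paper builds on) are illustrative assumptions.

```python
# Illustrative sketch only; not the paper's implementation.
import numpy as np

def ten_bin_discretize(x, n_bins=10):
    """Equal-width discretization: split [min, max] into n_bins intervals."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # np.digitize maps each value to the index of the interval it falls in;
    # using only the interior edges yields bin indices 0..n_bins-1.
    return np.clip(np.digitize(x, edges[1:-1]), 0, n_bins - 1)

# Hypothetical data: a continuous feature x and a binary class label y.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = (x + rng.normal(scale=0.5, size=100) > 0).astype(int)

bins = ten_bin_discretize(x)
for c in (0, 1):
    counts = np.bincount(bins[y == c], minlength=10)
    # Add-one (Laplace) smoothing == symmetric Dirichlet prior on bin probabilities.
    probs = (counts + 1) / (counts.sum() + 10)
    print(f"class {c}: P(bin | class) =", np.round(probs, 2))
```

Under a Dirichlet prior, the smoothed bin frequencies above are exactly the posterior mean estimates of P(X in bin | class), which is the quantity a discretized naive Bayes classifier multiplies across features.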