Maximizing expected gain in supervised discrete Bayesian classification when fusing binary-valued features

In this paper, previously reported work is extended to the fusion of binary-valued features. When mining discrete data to train supervised discrete Bayesian classifiers, it is often of interest to determine the threshold setting that maximizes performance. In this work, a discrete Bayesian classification model together with a gain function is used to determine the best threshold setting for a given number of binary-valued training data under each class. Results are demonstrated on simulated data by plotting the expected gain versus the threshold setting for different numbers of training data. The expected gain is shown to reach a maximum at a particular threshold, and this maximum point varies with the overall quantization of the data. Additional results are presented for a different gain function on the decision variable, and these are used to extend previously reported results.
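The following is a minimal sketch of the kind of experiment the abstract describes: sweeping a quantization threshold, training a discrete Bayesian classifier on binary-valued features, and estimating the expected gain for several training-set sizes. It is not the paper's model; the Gaussian class-conditional data, the 0/1 gain (probability of correct classification), and the uniform Dirichlet (Beta(1,1)) prior are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def expected_gain(threshold, n_train, n_mc=200, n_test=500):
    """Monte Carlo estimate of expected gain (here taken as the probability of
    correct classification) for a two-class discrete Bayesian classifier built
    from binary features obtained by thresholding simulated continuous data."""
    correct, total = 0, 0
    for _ in range(n_mc):
        # Simulated continuous observations for the two classes (assumed Gaussian)
        x0 = rng.normal(0.0, 1.0, n_train)
        x1 = rng.normal(1.0, 1.0, n_train)
        # Binary-valued features via the candidate threshold
        b0 = (x0 > threshold).astype(int)
        b1 = (x1 > threshold).astype(int)
        # Posterior mean of P(feature = 1 | class) under a uniform Dirichlet
        # (Beta(1,1)) prior, i.e. Laplace smoothing of the training counts
        p0 = (b0.sum() + 1) / (n_train + 2)
        p1 = (b1.sum() + 1) / (n_train + 2)
        # Classify fresh test samples, assuming equal class priors
        t0 = (rng.normal(0.0, 1.0, n_test) > threshold).astype(int)
        t1 = (rng.normal(1.0, 1.0, n_test) > threshold).astype(int)
        for b, true_class in ((t0, 0), (t1, 1)):
            like0 = np.where(b == 1, p0, 1 - p0)
            like1 = np.where(b == 1, p1, 1 - p1)
            decisions = (like1 > like0).astype(int)
            correct += np.sum(decisions == true_class)
            total += b.size
    return correct / total

# Expected gain versus threshold for several numbers of training data
thresholds = np.linspace(-1.0, 2.0, 13)
for n_train in (5, 20, 100):
    gains = [expected_gain(t, n_train) for t in thresholds]
    best = thresholds[int(np.argmax(gains))]
    print(f"n_train={n_train:4d}  best threshold ~ {best:.2f}")
```

Plotting `gains` against `thresholds` for each `n_train` reproduces the qualitative behavior described above: the expected gain peaks at some threshold, and the location of the peak shifts as the amount of training data, and hence the reliability of the estimated feature probabilities, changes.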
