On the Sensitivity of the Probability of Error Rule for Feature Selection

The low sensitivity of the probability of error rule (Pe rule) for feature selection is demonstrated and discussed. It is shown that, under certain conditions, features with significantly different discrimination power are treated as equivalent by the Pe rule. The main reason for this phenomenon is that the Pe rule depends directly only on the most probable class and that, under these conditions, the prior most probable class remains the posterior most probable class regardless of the value observed for the feature. A tie-breaking rule is suggested to refine the feature ordering induced by the Pe rule: when two features have the same expected probability of error, the feature with the higher variance of the probability of error is preferred.
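
The effect can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `pe_stats` and `rank_features`, the discrete two-class setting, the joint probability tables for the hypothetical features "A" and "B", and the rounding used to detect ties in floating point. Feature "A" carries no information about the class, while feature "B" shifts the posterior noticeably, yet in both cases the prior most probable class stays the most probable one for every observed value.

```python
def pe_stats(joint):
    """Return (expected Pe, variance of Pe) for one discrete feature.

    joint[c][x] is the assumed joint probability P(class=c, feature=x).
    For each feature value x, Pe(x) = 1 - max_c P(c | x).
    """
    n_values = len(joint[0])
    mean = 0.0
    second_moment = 0.0
    for x in range(n_values):
        p_x = sum(row[x] for row in joint)  # marginal P(x)
        if p_x == 0.0:
            continue  # a value that never occurs contributes nothing
        pe_x = 1.0 - max(row[x] for row in joint) / p_x
        mean += p_x * pe_x
        second_moment += p_x * pe_x ** 2
    return mean, second_moment - mean ** 2


def rank_features(joints):
    """Order feature names by expected Pe (lower first).

    Ties are broken by preferring the higher variance of Pe.
    Rounding to 12 decimals is an assumption of this sketch so that
    floating-point noise does not hide a genuine tie.
    """
    stats = {name: pe_stats(joint) for name, joint in joints.items()}
    return sorted(stats, key=lambda n: (round(stats[n][0], 12), -stats[n][1]))


# Hypothetical example: prior P(c0) = 0.8, P(c1) = 0.2 for both features.
features = {
    # "A" is independent of the class: P(c0 | x) = 0.8 for both values.
    "A": [[0.4, 0.4],
          [0.1, 0.1]],
    # "B" is informative: P(c0 | x=0) = 0.95, P(c0 | x=1) = 0.65,
    # but c0 remains the most probable class either way.
    "B": [[0.475, 0.325],
          [0.025, 0.175]],
}

ranking = rank_features(features)
print(ranking)
for name in ranking:
    print(name, pe_stats(features[name]))
```

In this example both features have an expected probability of error equal to the prior error of 0.2, so the Pe rule alone cannot separate them. The variance of the probability of error is zero for "A" and positive for "B", so the tie-breaking rule places the informative feature first.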