Feature Selection for Case-Based Classification of Cloud Types: An Empirical Comparison

Accurate weather prediction is crucial for many activities, including Naval operations. Researchers within the meteorological division of the Naval Research Laboratory have developed and fielded several expert systems for problems such as fog and turbulence forecasting, and tropical storm movement. They are currently developing an automated system for satellite image interpretation, part of which involves cloud classification. Their cloud classification database contains 204 high-level features but only a few thousand instances. The predictive accuracy of classifiers on this task can be improved by employing a feature selection algorithm. We explain why non-parametric case-based classifiers are excellent choices for use in feature selection algorithms. We then describe a set of such algorithms that use case-based classifiers, empirically compare them, and introduce novel extensions of backward sequential selection that allow it to scale to this task. Several of the approaches we tested located feature subsets that attain significantly higher accuracies than those found in previously published research, and some did so with fewer features.
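The combination described above, backward sequential selection wrapped around a case-based classifier, can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: it pairs greedy backward elimination with a leave-one-out 1-nearest-neighbor classifier, and all function names and the stopping rule are assumptions for the sketch.

```python
import numpy as np

def loo_1nn_accuracy(X, y, feats):
    """Leave-one-out accuracy of a 1-nearest-neighbor (case-based)
    classifier restricted to the given feature subset."""
    Xf = X[:, feats]
    correct = 0
    for i in range(len(y)):
        # Euclidean distance from instance i to every stored case
        d = np.linalg.norm(Xf - Xf[i], axis=1)
        d[i] = np.inf  # exclude the held-out instance itself
        correct += y[np.argmin(d)] == y[i]
    return correct / len(y)

def backward_sequential_selection(X, y):
    """Greedy backward sequential selection (illustrative): start with
    all features and repeatedly drop the single feature whose removal
    improves leave-one-out accuracy, stopping when no removal helps."""
    feats = list(range(X.shape[1]))
    best = loo_1nn_accuracy(X, y, feats)
    improved = True
    while improved and len(feats) > 1:
        improved = False
        for f in list(feats):
            trial = [g for g in feats if g != f]
            acc = loo_1nn_accuracy(X, y, trial)
            if acc > best:
                # Accept the removal and rescan the reduced subset
                best, feats, improved = acc, trial, True
                break
    return feats, best
```

Because the classifier is non-parametric, each candidate subset is scored without refitting a model, which is one reason case-based classifiers suit wrapper-style feature selection; on a 204-feature task, however, each backward pass evaluates up to 204 subsets, which motivates the scaling extensions the abstract mentions.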