The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon

In this paper, we study the use of unlabeled samples to reduce the small training sample size problem, which can severely degrade the recognition rate of classifiers when the dimensionality of the multispectral data is high. We show that by using additional unlabeled samples, which are available at no extra cost, classification performance may be improved and the Hughes phenomenon can therefore be mitigated. Furthermore, we show by experiments that using additional unlabeled samples yields more representative estimates. We also propose a semiparametric method for incorporating the training (i.e., labeled) and unlabeled samples simultaneously into the parameter estimation process.
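To make the idea of estimating parameters from labeled and unlabeled samples together more concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a simpler, fully parametric model with one Gaussian per class, fitted by an EM-style iteration in which labeled samples keep fixed class memberships and unlabeled samples receive posterior weights. The function names (`gaussian_pdf`, `semi_supervised_em`), the regularization term `reg`, and the choice to estimate class priors from the unlabeled samples only are all assumptions made for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density evaluated at each row of x."""
    d = x.shape[1]
    diff = x - mean
    solved = np.linalg.solve(cov, diff.T).T          # rows hold cov^{-1} (x - mean)
    expo = -0.5 * np.sum(diff * solved, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return np.exp(expo - 0.5 * (d * np.log(2.0 * np.pi) + logdet))

def semi_supervised_em(x_lab, y_lab, x_unl, n_classes, n_iter=50, reg=1e-6):
    """Illustrative EM with one Gaussian per class (a simplifying assumption).

    x_lab: (n_lab, d) labeled samples; y_lab: integer labels in [0, n_classes).
    x_unl: (n_unl, d) unlabeled samples.
    Needs at least two labeled samples per class for the initial covariance.
    """
    d = x_lab.shape[1]
    ridge = reg * np.eye(d)                          # keeps covariances invertible
    r_lab = np.eye(n_classes)[y_lab]                 # fixed one-hot memberships
    # Initialize from the labeled samples alone.
    means = np.array([x_lab[y_lab == k].mean(axis=0) for k in range(n_classes)])
    covs = np.array([np.cov(x_lab[y_lab == k].T) + ridge for k in range(n_classes)])
    priors = np.full(n_classes, 1.0 / n_classes)
    x_all = np.vstack([x_lab, x_unl])

    for _ in range(n_iter):
        # E-step: posterior class weights for the unlabeled samples only.
        dens = np.column_stack([
            priors[k] * gaussian_pdf(x_unl, means[k], covs[k])
            for k in range(n_classes)
        ])
        r_unl = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        r_all = np.vstack([r_lab, r_unl])

        # M-step: weighted means and covariances over all samples.
        nk = r_all.sum(axis=0)
        means = (r_all.T @ x_all) / nk[:, None]
        for k in range(n_classes):
            diff = x_all - means[k]
            covs[k] = (r_all[:, k, None] * diff).T @ diff / nk[k] + ridge
        # Assumption: priors come from unlabeled samples, since labeled
        # samples are often not drawn in proportion to class frequency.
        priors = r_unl.sum(axis=0) / len(x_unl)

    return priors, means, covs
```

Calling `semi_supervised_em` with a handful of labeled samples per class and a much larger unlabeled set returns the estimated priors, means, and covariances. Comparing these with estimates from the labeled samples alone is one simple way to see how the extra samples affect the estimates.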
