Improved Effective Range-Based Feature Selection Considering the Removal of Redundancy Among Features

Feature selection plays a key role in classification problems with high-dimensional data. The main idea of feature selection is to reduce the dimensionality of the input space while preserving classification accuracy. The effective range is an efficient way to measure the importance of features in a classification problem, as shown in recent work on improved feature selection based on effective range (IFSER). However, IFSER considers only the overlapping area and the including area of the effective ranges of each class for every feature; it does not examine how far apart the effective ranges are. To overcome this limitation, we propose the concept of extent of separation. In addition, we minimize redundancy among features by eliminating features that are strongly correlated with others. Finally, we present experimental results comparing the proposed method with several benchmark methods: ERGS (Effective Range based Gene Selection), IFSER, PCC (Pearson Correlation Coefficient) based feature selection, Chi-square feature selection, and ReliefF.
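The redundancy-elimination step can be illustrated with a minimal sketch. It is not the procedure of this paper, which the abstract does not specify: the greedy scheme, the threshold of 0.9, the toy data, and the names `pearson` and `drop_redundant` are all assumptions made for illustration. Features are visited in descending order of an importance score, and a feature is kept only if its absolute Pearson correlation with every feature already kept does not exceed the threshold.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant feature: treated as uncorrelated (assumption)
    return cov / math.sqrt(var_x * var_y)


def drop_redundant(columns, scores, threshold=0.9):
    """Greedy redundancy filter (illustrative, hypothetical helper).

    columns:   list of feature columns, one list of values per feature
    scores:    one importance score per feature, higher is better
    threshold: assumed cutoff on |r| above which two features are redundant
    Returns the indices of the kept features, best score first.
    """
    order = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep feature i only if it is not strongly correlated with any kept one.
        if all(abs(pearson(columns[i], columns[j])) <= threshold for j in kept):
            kept.append(i)
    return kept


# Toy data (made up for illustration only).
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],    # feature 0
    [2.1, 4.0, 6.2, 7.9, 10.0],   # feature 1: almost a multiple of feature 0
    [5.0, 1.0, 4.0, 2.0, 3.0],    # feature 2: weakly related to feature 0
]
scores = [0.8, 0.6, 0.5]          # hypothetical importance scores
selected = drop_redundant(columns, scores)
print(selected)
```

In this toy example feature 1 is discarded because it nearly duplicates the higher-scored feature 0, while feature 2 survives. In practice the scores would come from whatever ranking criterion is in use, and the threshold would need to be tuned on the data.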