Enhancing electronic nose performance by sensor selection using a new integer-based genetic algorithm approach

Feature selection techniques can be used to find an optimal subset of sensors in a high-dimensional array by eliminating redundant or irrelevant ones. Optimising the array size can potentially improve overall system performance by maximising the information content and hence increasing the predictive accuracy. However, searching a high-dimensional space is problematic because of the very large number of possible sensor combinations. A novel search procedure, the V-integer genes genetic algorithm (GA), is introduced and compared with other search methods such as sequential forward and backward searches (SFS and SBS) and X-binary genes GAs. Results are presented for a data-set of 180 samples from eye bacteria screening tests, collected using an electronic nose (EN) with 32 sensing elements. For this data-set, SFS achieved over 89% correct classification by selecting just three features, whereas SBS needed at least five features to reach the same level. With 32-binary genes GAs, the dimensionality is reduced by 50–60% and the classification rates average 91%. With eight, six or four features, the optimal subsets returned by the V-integer genes GA reduce the dimensionality by over 80% and achieve around 90% correct classification on average. Two selections, of six and three features, are considered for further pattern recognition (PARC) analysis using different classifiers. These results show that the newly developed V-integer genes GA approach is an accurate and, importantly, very fast search method compared with some other feature selection techniques.
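
To make the difference between the two chromosome encodings concrete, the following is a minimal Python sketch, not the implementation used in this work. It assumes that a binary chromosome carries one on/off gene per sensor, while a V-integer chromosome carries V genes that each hold a sensor index. The operators (one-point crossover, random-reset mutation, keeping the better half of the population), all parameter values, and the names `integer_ga` and `toy_fitness` are hypothetical choices made for illustration. In practice the fitness would be something like a cross-validated classification rate; here it is a placeholder.

```python
import random

N_SENSORS = 32  # array size reported in the abstract


def decode_binary(chromosome):
    """Binary encoding: gene i is 1 if sensor i is selected."""
    return [i for i, bit in enumerate(chromosome) if bit]


def decode_integer(chromosome):
    """Integer encoding: each gene holds a sensor index; duplicates collapse."""
    return sorted(set(chromosome))


def integer_ga(fitness, v, n_sensors=N_SENSORS, pop_size=20,
               generations=30, mutation_rate=0.1, seed=0):
    """Toy GA over chromosomes of v integer genes (assumed operators)."""
    rng = random.Random(seed)
    pop = [[rng.randrange(n_sensors) for _ in range(v)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness of the decoded sensor subset, best first.
        pop.sort(key=lambda c: fitness(decode_integer(c)), reverse=True)
        elite = pop[: pop_size // 2]  # better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, v) if v > 1 else 0  # one-point crossover
            child = a[:cut] + b[cut:]
            # Random-reset mutation: replace a gene with a new sensor index.
            child = [rng.randrange(n_sensors)
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=lambda c: fitness(decode_integer(c)))
    return decode_integer(best)


# Placeholder fitness (hypothetical): counts how many selected sensors fall
# in an arbitrary "informative" set. It stands in for a classifier score.
INFORMATIVE = {1, 4, 9, 16, 25, 30}


def toy_fitness(subset):
    return len(INFORMATIVE.intersection(subset))


subset = integer_ga(toy_fitness, v=6)
print("selected sensors:", subset)
```

Under these assumptions the integer chromosome never selects more than V sensors, so the search space is restricted to small subsets from the outset, whereas a 32-gene binary chromosome ranges over every possible on/off pattern of the array.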