Stability Assessment of Feature Selection Algorithms on Homogeneous Datasets: A Study for Sensor Array Optimization Problem

A feature selection algorithm (FSA) eliminates redundant and irrelevant features, which reduces both the dimensionality and the complexity of the original problem. In real-world applications, however, the stability of the FSA output is a major concern. Stability refers to how consistent an algorithm's feature preferences remain when the data samples are perturbed. In sensor array optimization, an FSA is used to find the best combination of sensors in a sensor array. The main objectives are typically to reduce data dimensionality, power consumption, production cost, and computational and traffic overhead. An FSA must therefore produce stable outputs across several observations before a firm conclusion can be drawn about which sensors to select. The contribution of this research is an investigation of FSA stability in relation to the sensor array optimization problem: the stability of seventeen filter-based FSAs is compared across twelve homogeneous datasets. These datasets were generated by an electronic nose (e-nose) used to monitor twelve types of beef cuts. In this setting, the gas sensor array must generalize well enough to differentiate all beef types. The experimental results show that a single FSA cannot guarantee a stable sensor recommendation in sensor array optimization. Researchers and practitioners should therefore take care to use a proper approach when performing sensor array optimization.
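The notion of stability described above can be made concrete with a small sketch. The abstract does not state which stability measure the study uses, so the example below assumes one common choice, the mean pairwise Jaccard similarity between the sensor subsets selected on different datasets. The function names and the sensor labels (`S1` to `S4`) are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of selected features."""
    a, b = set(a), set(b)
    if not a and not b:
        # Two empty selections are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


def selection_stability(subsets):
    """Mean pairwise Jaccard similarity over all selected subsets.

    Assumption: this is an illustrative stability score, not necessarily
    the measure used in the study. 1.0 means every run selected the same
    features; 0.0 means no two runs shared any feature.
    """
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two subsets to assess stability")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical output of one FSA applied to three homogeneous datasets.
runs = [
    {"S1", "S2", "S3"},
    {"S1", "S2", "S4"},
    {"S1", "S2", "S3"},
]
stability = selection_stability(runs)
print(f"stability = {stability:.3f}")
```

A score well below 1.0 for a single FSA would indicate that the recommended sensors change from one dataset to the next, which is the situation the abstract cautions against.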
