Advanced Statistical Metrics for Gas Identification System With Quantification Feedback

Pattern recognition for real-life gas identification is challenging because only a limited amount of data is available, because the sensing mechanism varies over time (mostly as a result of drift), and because detection must run in real time. These problems are commonly caused by the slow response of most gas sensors. In this paper, a novel gas identification approach based on the cluster-k-nearest neighbor (C-k-NN) is introduced. The effectiveness of this approach is demonstrated on an experimental data set obtained from an array of gas sensors. The classifier takes advantage of both k-NN, which is highly accurate, and k-means clustering, which reduces the classification time. To increase the accuracy rate, a new feature selection method is proposed in which features are selected according to their ability to separate and distinguish between different classes. Advanced statistical metrics are introduced to quantify the classification contribution of each feature. Most classifiers cannot detect their own misclassifications; new statistical metrics are therefore introduced to estimate the exactness of the classifier response, i.e., to detect misclassification. To further enhance classification performance for gas identification, a new tree classification design, named tree C-k-NN, is introduced. To assess the technique, experiments were conducted on six different gases. An accuracy rate of 98.7% was obtained with the C-k-NN and 100% with the tree C-k-NN. The performance of this approach is also validated on three publicly available data sets.
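The idea of pairing k-means with k-NN can be illustrated with a minimal sketch. The abstract does not specify how the two are combined, so the code below assumes one plausible arrangement: each class is summarized by a few k-means centroids, and k-NN then votes over those centroids instead of over every training sample, which is what shortens classification time. The function names (`kmeans`, `fit_cluster_knn`, `predict_cluster_knn`) and the parameter `clusters_per_class` are hypothetical and are not taken from the paper.

```python
import numpy as np

def kmeans(X, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means on X (n_samples, n_features); returns centroids."""
    rng = np.random.default_rng(seed)
    # Initialize centroids with distinct randomly chosen samples.
    centroids = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Distance of every sample to every centroid.
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(n_clusters)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids

def fit_cluster_knn(X, y, clusters_per_class=3):
    """Assumed training step: replace each class by its k-means centroids."""
    prototypes, prototype_labels = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        m = min(clusters_per_class, len(Xc))
        prototypes.append(kmeans(Xc, m))
        prototype_labels.extend([c] * m)
    return np.vstack(prototypes), np.array(prototype_labels)

def predict_cluster_knn(prototypes, prototype_labels, X, k=1):
    """Majority vote among the k nearest centroids for each query sample."""
    dist = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    predictions = []
    for votes in prototype_labels[nearest]:
        values, counts = np.unique(votes, return_counts=True)
        predictions.append(values[counts.argmax()])
    return np.array(predictions)
```

With this arrangement, a query is compared against `clusters_per_class` centroids per class rather than the full training set, so the prediction cost no longer grows with the number of training samples. The paper's feature selection metrics, misclassification metrics, and tree C-k-NN design are not reproduced here.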
