Principal component analysis approach in selecting type-1 and type-2 fuzzy membership functions for high-dimensional data

With growing interest in learning from data, algorithms that operate on datasets with hundreds of features have become common in fields such as medicine, image processing, geolocation, biochemistry, and computational linguistics. Many of these applications rely on fuzzy sets to represent uncertainty, so a method is needed for selecting the most suitable fuzzy membership function to represent a high-dimensional dataset. In this paper, we propose such a method. It first reduces dimensionality using Principal Component Analysis (PCA) and then applies the Wilcoxon Minimal Bin Size algorithm, which has previously been evaluated on multidimensional datasets of up to 8 dimensions. We demonstrate the proposed method on two real datasets with 281 and 500 features, respectively.
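
As a rough illustration of the dimensionality-reduction step only, the following minimal sketch projects a high-dimensional dataset onto its leading principal components using a singular value decomposition. The function name `pca_reduce`, the synthetic 281-feature data, and the choice of 8 components are assumptions made for this example (8 is used only because the subsequent algorithm is reported as evaluated up to 8 dimensions); the sketch does not implement the Wilcoxon Minimal Bin Size algorithm and is not the authors' implementation.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD).

    Hypothetical helper for illustration; returns the projected data and
    the fraction of total variance explained by each retained component.
    """
    X = np.asarray(X, dtype=float)
    # PCA operates on mean-centered features.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing
    # singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_variance_ratio = (S ** 2) / np.sum(S ** 2)
    Z = X_centered @ Vt[:n_components].T
    return Z, explained_variance_ratio[:n_components]


# Synthetic stand-in for a high-dimensional dataset (assumption):
# 200 samples, 281 features, generated from 3 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 281))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 281))

Z, ratio = pca_reduce(X, n_components=8)
print(Z.shape)         # shape of the reduced dataset
print(ratio.round(3))  # variance explained by each retained component
```

In practice the number of retained components would be chosen by a stopping rule or an explained-variance threshold rather than fixed in advance, and the reduced data `Z` would then be passed to the membership-function selection step.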
