Constructing Compact Takagi-Sugeno Rule Systems: Identification of Complex Interactions in Epidemiological Data

The Takagi-Sugeno (TS) fuzzy rule system is a widely used data mining technique that is particularly useful for identifying non-linear interactions between variables. However, the number of rules increases dramatically when the system is applied to high-dimensional data sets (the curse of dimensionality). Few robust methods are available for identifying important rules while removing redundant ones, which limits the applicability of TS systems in fields such as epidemiology and bioinformatics, where the interaction of many variables must be considered. Here, we develop a new parsimonious TS rule system. We propose three statistics, the R-, L-, and ω-values, to rank the importance of each TS rule, and a forward selection procedure to construct the final model. We use our method to predict how key components of childhood deprivation combine to influence educational achievement. We show that a parsimonious TS model, based on a small subset of rules, can provide an accurate description of the relationship between deprivation indices and educational outcomes. The selected rules shed light on the synergistic relationships between the variables, and reveal that the effect of targeting a specific domain of deprivation depends crucially on the state of the other domains. Policy decisions need to take these interactions into account, and deprivation indices should not be considered in isolation. The TS rule system provides a basis for such decision making, and is widely applicable to the identification of non-linear interactions in complex biomedical data.
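To make the two ingredients named above concrete, the following is a minimal Python sketch of first-order TS inference combined with a greedy forward selection over candidate rules. It is an illustration under stated assumptions, not the method of the paper: the Gaussian membership functions, the rule dictionaries, the function names, and the toy data are all hypothetical, and plain squared error stands in for the R-, L-, and ω-values, which are not defined in this abstract. Consequent parameters are held fixed here rather than re-estimated at each step.

```python
import math

def gaussian(x, centre, width):
    """Gaussian membership degree of x (assumed membership shape)."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)

def firing_strength(rule, x):
    """Product of the per-input membership degrees of one rule."""
    w = 1.0
    for xi, (centre, width) in zip(x, rule["antecedent"]):
        w *= gaussian(xi, centre, width)
    return w

def ts_predict(rules, x):
    """First-order TS output: firing-strength-weighted mean of linear consequents."""
    num, den = 0.0, 0.0
    for rule in rules:
        w = firing_strength(rule, x)
        b0, *b = rule["consequent"]
        num += w * (b0 + sum(bi * xi for bi, xi in zip(b, x)))
        den += w
    return num / den if den > 0 else 0.0

def sse(rules, X, y):
    """Sum of squared errors of a rule subset on the data."""
    return sum((ts_predict(rules, x) - t) ** 2 for x, t in zip(X, y))

def forward_select(candidates, X, y, max_rules, tol=1e-9):
    """Greedily add the rule that most reduces SSE; stop when nothing helps.

    SSE is a stand-in criterion (assumption), not the paper's ranking statistics.
    """
    selected, remaining = [], list(candidates)
    best_err = float("inf")
    while remaining and len(selected) < max_rules:
        err, i = min((sse(selected + [r], X, y), i) for i, r in enumerate(remaining))
        if best_err - err <= tol:
            break
        selected.append(remaining.pop(i))
        best_err = err
    return selected, best_err

# Hypothetical toy data: one input, three candidate rules with constant consequents.
candidates = [
    {"name": "A", "antecedent": [(-1.0, 1.0)], "consequent": [0.0, 0.0]},
    {"name": "B", "antecedent": [(1.0, 1.0)], "consequent": [1.0, 0.0]},
    {"name": "C", "antecedent": [(0.0, 1.0)], "consequent": [5.0, 0.0]},
]
X = [[-1.0], [1.0]]
y = [0.0, 1.0]
selected, err = forward_select(candidates, X, y, max_rules=3)
selected_names = [r["name"] for r in selected]
```

On this toy problem the redundant (and harmful) third rule is left out, which is the kind of compact rule base the abstract argues for. A fuller treatment would refit the consequents after each addition and use a validation criterion rather than training error.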
