Online Active Learning in Data Stream Regression Using Uncertainty Sampling Based on Evolving Generalized Fuzzy Models

In this paper, we propose three criteria for efficient sample selection for data stream regression problems within an online active learning context. The selection becomes important whenever the target values, which guide the update of the regressors as well as the implicit model structures, are costly or time-consuming to measure, or when very fast model updates are required to meet the real-time demands of stream mining. Reducing the number of selected samples as much as possible while keeping the predictive accuracy of the models at a high level is, thus, a central challenge. Ideally, this should be achieved in an unsupervised, single-pass manner. Our selection criteria rely on three aspects: 1) the extrapolation degree combined with the model's nonlinearity degree, which is measured in terms of a new specific homogeneity criterion among adjacent local approximators; 2) the uncertainty in model outputs, which can be measured in terms of confidence intervals using so-called adaptive local error bars; here, we integrate a weighted localization of an incremental noise level estimator and propose formulas for online merging of local error bars; 3) the uncertainty in model parameters, which is estimated by the so-called A-optimality criterion, which relies on the Fisher information matrix. The selection criteria are developed in combination with evolving generalized Takagi–Sugeno (TS) fuzzy models (containing rules in arbitrarily rotated position), because previous publications have shown that these outperform conventional evolving TS models (containing axis-parallel rules). The results based on three high-dimensional real-world streaming problems show that a model update based on only 10%–20% selected samples can still achieve accumulated model errors over time similar to those of a full model update on all samples. This can be achieved with negligible sensitivity to the size of the active learning latency buffer. Random sampling with the same percentages of samples selected, however, achieved much higher error rates. Hence, the intelligence in our sample selection concept leads to an economical balance between model accuracy and measurement as well as computational costs for model updates.
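To illustrate the idea behind the third criterion, the following is a minimal sketch of A-optimality-based selection for a single linear model, not the formulas proposed in the paper for evolving generalized fuzzy models. It assumes unit noise variance, so that the Fisher information matrix is X^T X and its inverse P is the parameter covariance; the function names and the fixed threshold are hypothetical. A sample is selected only if including it would reduce trace(P) by more than the threshold, which requires only the input vector and no target value.

```python
import numpy as np

def a_optimality_gain(P, x):
    """Reduction of trace(P) if regressor x were added, with P = (X^T X)^{-1}.

    Follows from the Sherman-Morrison identity; P is assumed symmetric.
    """
    Px = P @ x
    return float(Px @ Px / (1.0 + x @ Px))

def update_inverse(P, x):
    """Rank-one update of the inverse Fisher information matrix."""
    Px = P @ x
    return P - np.outer(Px, Px) / (1.0 + x @ Px)

def select_stream(X, P0, threshold):
    """Single pass over the stream; returns indices of selected samples.

    The threshold is a hypothetical tuning parameter for this sketch.
    """
    P = P0.copy()
    selected = []
    for i, x in enumerate(X):
        if a_optimality_gain(P, x) > threshold:
            selected.append(i)          # target value would be requested here
            P = update_inverse(P, x)    # only selected samples update P
    return selected, P

if __name__ == "__main__":
    X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    selected, P = select_stream(X, np.eye(2), threshold=0.2)
    print(selected, np.trace(P))
```

In this toy stream, the repeated input contributes little new information about the parameters and is skipped, while the input in a new direction is selected.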
