A Parsimonious Mixture of Gaussian Trees Model for Oversampling in Imbalanced and Multimodal Time-Series Classification

We propose a novel framework that uses a parsimonious statistical model, known as a mixture of Gaussian trees, to model the possibly multimodal minority class in imbalanced time-series classification. By exploiting the fact that nearby time points are highly correlated due to the smoothness of the time series, our model reduces the number of covariance parameters to be estimated from O(d^2) to O(Ld), where L is the number of mixture components and d is the dimensionality. Thus, our model is particularly effective for modeling high-dimensional time series with a limited number of instances in the minority positive class. In addition, the computational complexity of learning the model is only of the order O(L n_+ d^2), where n_+ is the number of positively labeled samples. We conduct extensive classification experiments on several well-known time-series data sets (both single- and multimodal), first randomly generating synthetic instances from our learned mixture model to correct the imbalance. We then compare our results with several state-of-the-art oversampling techniques. The results demonstrate that when our proposed model is used for oversampling, the same support vector machine classifier achieves much better classification accuracy across the range of data sets. In fact, the proposed method achieves the best average performance on 30 of the 36 multimodal data sets according to the F-value metric. Our results are also highly competitive with non-oversampling-based classifiers for imbalanced time-series data sets.
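To make the parameter saving concrete, the following is a minimal sketch of a single Gaussian tree, that is, one mixture component (L = 1), fitted to minority-class series and then used to draw synthetic instances. It assumes the standard Chow-Liu construction (a maximum spanning tree over pairwise dependence) and ancestral sampling along the tree; the abstract does not spell out the learning procedure, so the function names `fit_gaussian_tree` and `sample_gaussian_tree`, the random-walk toy data, and the choice of root are illustrative assumptions rather than the authors' implementation. A full mixture would additionally require something like EM over L such trees, which is omitted here.

```python
import math
import random


def fit_gaussian_tree(X):
    """Fit one Gaussian tree to the rows of X (a list of equal-length lists).

    Assumes every coordinate has nonzero sample variance.
    """
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    # Sample covariance (divide by n); this step costs O(n d^2).
    cov = [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in X) / n
            for j in range(d)] for i in range(d)]

    # Gaussian mutual information increases with squared correlation, so a
    # maximum spanning tree on squared correlations is the Chow-Liu tree.
    def weight(i, j):
        return cov[i][j] ** 2 / (cov[i][i] * cov[j][j])

    parent = {0: None}          # node 0 is an arbitrary root (assumption)
    order = [0]                 # order in which nodes join the tree
    best = {j: (weight(0, j), 0) for j in range(1, d)}
    while best:                 # Prim's algorithm, O(d^2) overall
        j = max(best, key=lambda k: best[k][0])
        parent[j] = best.pop(j)[1]
        order.append(j)
        for k in best:
            w = weight(j, k)
            if w > best[k][0]:
                best[k] = (w, j)
    return mean, cov, parent, order


def sample_gaussian_tree(model, rng):
    """Draw one synthetic instance by ancestral sampling along the tree."""
    mean, cov, parent, order = model
    x = [0.0] * len(mean)
    for j in order:
        p = parent[j]
        if p is None:
            mu, var = mean[j], cov[j][j]
        else:
            # Conditional Gaussian of x[j] given its parent x[p].
            beta = cov[j][p] / cov[p][p]
            mu = mean[j] + beta * (x[p] - mean[p])
            var = cov[j][j] - beta * cov[j][p]
        x[j] = rng.gauss(mu, math.sqrt(max(var, 0.0)))
    return x


# Toy minority class: 40 smooth series of length 8 (random walks).
rng = random.Random(0)
minority = []
for _ in range(40):
    level, series = 0.0, []
    for _ in range(8):
        level += rng.gauss(0.0, 1.0)
        series.append(level)
    minority.append(series)

tree = fit_gaussian_tree(minority)
synthetic = [sample_gaussian_tree(tree, rng) for _ in range(10)]
```

Only the d variances and the d - 1 covariances on the tree edges are used when sampling, so each component needs 2d - 1 covariance parameters instead of the d(d + 1)/2 of a full covariance matrix; with L components this gives the O(Ld) count stated above.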
