New Frontiers in Mining Complex Patterns

In this talk I will describe approaches for generating patterns efficiently and for presenting them. In particular, I will show pattern sampling algorithms that extend easily to structured data, and an interactive embedding technique that lets users investigate pattern collections intuitively.
