Learning Interpretable Rules for Multi-label Classification

Multi-label classification (MLC) is a supervised learning problem in which, contrary to standard multiclass classification, an instance can be associated with several class labels simultaneously. In this chapter, we advocate a rule-based approach to multi-label classification. Rule learning algorithms are often employed when one is not only interested in accurate predictions, but also requires an interpretable theory that can be understood, analyzed, and qualitatively evaluated by domain experts. Ideally, by revealing patterns and regularities contained in the data, a rule-based theory yields new insights in the application domain. Recently, several authors have started to investigate how rule-based models can be used for modeling multi-label data. Discussing this task in detail, we highlight some of the problems that make rule learning considerably more challenging for MLC than for conventional classification. While mainly focusing on our own previous work, we also provide a short overview of related work in this area.
