Combinatorial refinement of feature weighting for linear classification

We present a new approach to optimising linear classification based on Combinatorial Refinement (ComRef) of feature weights, aimed at cognitive signal processing on resource-limited hardware and software such as cyber-physical systems. Despite its simple construction, the approach combines the advantages of dimensionality-reduction methods with those of multiple-classifier (bag-of-classifiers) approaches and achieves good generalisation even with small feature sets. We benchmark the generalisation performance of ComRef on several datasets from the UCI repository. Furthermore, on an industrial motor drive diagnosis dataset we show the advantage of ComRef in combination with Support Vector Machines (SVMs). In this application scenario a trustworthy classifier is essential, since even a small number of misclassifications could lead to motor damage.
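Since the abstract does not spell out the ComRef procedure itself, the following is only a minimal sketch of the general idea it names: combinatorially refining which features a linear classifier uses and keeping the weighting that generalises best under cross-validation. The dataset (Iris as a stand-in for a UCI benchmark), the subset-size limit, and the scikit-learn API are assumptions for illustration, not the authors' implementation.

```python
# Sketch: exhaustive combinatorial search over small feature subsets
# (a binary feature weighting) for a linear SVM, scored by cross-validation.
# This only illustrates the idea of combinatorial refinement of feature
# weighting; the actual ComRef algorithm is not given in the abstract.
from itertools import combinations

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC


def comref_like_search(X, y, max_features=3, cv=5):
    """Score every feature subset up to max_features and return the best."""
    best_score, best_subset = -np.inf, None
    n_features = X.shape[1]
    for k in range(1, max_features + 1):
        for subset in combinations(range(n_features), k):
            cols = list(subset)
            clf = LinearSVC(max_iter=10000)  # linear classifier under refinement
            score = cross_val_score(clf, X[:, cols], y, cv=cv).mean()
            if score > best_score:
                best_score, best_subset = score, subset
    return best_subset, best_score


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)  # hypothetical stand-in for a UCI dataset
    subset, score = comref_like_search(X, y)
    print(f"best feature subset: {subset}, CV accuracy: {score:.3f}")
```

A full implementation would have to avoid the exponential cost of exhaustive enumeration for larger feature sets, for example by refining the weighting greedily or over randomly drawn feature subspaces.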
