Relevance as a Metric for Evaluating Machine Learning Algorithms

In machine learning, choosing a learning algorithm suited to the application domain is critical. The performance metric used to compare algorithms must also reflect the concerns of users in that domain. In this paper, we propose a novel probability-based performance metric, called Relevance Score, for evaluating supervised learning algorithms. We evaluate the proposed metric through empirical analysis on a dataset gathered from an intelligent lighting pilot installation. Compared with the commonly used Classification Accuracy metric, the Relevance Score proves more appropriate for a certain class of applications.
