Exploiting Ordinal Class Structure in Multiclass Classification: Application to Ovarian Cancer

In multiclass machine learning problems, one must distinguish between nominal labels, which have no natural ordering, and ordinal labels, which are ordered. Ordinal labels are pervasive in biology, and some examples are given here. In this note, we point out the importance of using the order information when it is inherent to the problem. On a case study that assigns one of four labels to ovarian cancer patients on the basis of their progression-free survival time, we demonstrate that algorithms using this additional information outperform algorithms that do not. As an aside, we also point out that algorithms that make use of ordering information require fewer data normalizations. This aspect is important in biological applications, where data are plagued by variations in platforms and protocols, batch effects, and so on.
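To make the idea of "using the order information" concrete, the following is a minimal sketch of one well-known ordinal scheme, the binary decomposition of Frank and Hall ("A Simple Approach to Ordinal Classification"). It is not necessarily the method used in this note. It assumes K ordered classes coded 0 to K-1 (K = 4 would match the four labels mentioned above, but the thresholds that map survival time to labels are not given here and are not assumed). The function names are hypothetical, and the probabilities P(y > t) are assumed to come from any K-1 binary classifiers trained separately.

```python
from typing import List, Sequence


def ordinal_to_binary_targets(label: int, num_classes: int) -> List[int]:
    """Encode an ordinal label as K-1 binary targets [label > 0, ..., label > K-2]."""
    if not 0 <= label < num_classes:
        raise ValueError("label must lie in [0, num_classes)")
    return [1 if label > t else 0 for t in range(num_classes - 1)]


def class_probabilities(p_greater: Sequence[float]) -> List[float]:
    """Combine estimates of P(y > t), t = 0..K-2, into K class probabilities.

    P(y = 0)   = 1 - P(y > 0)
    P(y = k)   = P(y > k-1) - P(y > k)   for 0 < k < K-1
    P(y = K-1) = P(y > K-2)
    """
    if len(p_greater) < 1:
        raise ValueError("need at least one binary estimate")
    num_classes = len(p_greater) + 1
    probs = [1.0 - p_greater[0]]
    for k in range(1, num_classes - 1):
        probs.append(p_greater[k - 1] - p_greater[k])
    probs.append(p_greater[-1])
    # Independently trained binary models need not be monotone in t,
    # so a difference can come out negative; clip it to zero.
    return [max(p, 0.0) for p in probs]


def predict_label(p_greater: Sequence[float]) -> int:
    """Return the class with the largest combined probability."""
    probs = class_probabilities(p_greater)
    return max(range(len(probs)), key=probs.__getitem__)
```

A nominal one-vs-rest scheme would train K unrelated classifiers; here each binary problem splits the classes at a threshold along the ordering, which is where the additional information enters.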
