Diagnosis of gastric carcinoma by classification on feature projections

A new classification algorithm, called the benefit maximizing classifier on feature projections (BCFP), is developed and applied to the diagnosis of gastric carcinoma. The domain contains records of patients whose diagnosis is known from gastroscopy results. Given a training set of such records, the BCFP classifier learns to classify a new case in the domain. BCFP represents a concept as a set of feature projections, one on each feature dimension, and classifies by voting among the individual predictions made on each feature. In the gastric carcinoma domain, a lesion can indicate one of nine levels of gastric carcinoma, from early to late stages. Correctly classifying an early level yields much more benefit than correctly classifying a late one, and the costs of misclassification are not symmetric. In the training phase, the BCFP algorithm learns classification rules that maximize the benefit of classification. In the querying phase, it uses these rules to make the prediction that maximizes the benefit. A genetic algorithm is applied to select the relevant features. The performance of the BCFP algorithm is evaluated in terms of accuracy and running time, and the induced rules are verified by domain experts.
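
The idea of voting on feature projections and then choosing the most beneficial prediction can be illustrated with a small sketch. This is not the BCFP algorithm as published: the abstract does not specify how votes are computed or how the benefit is represented. The sketch assumes categorical features, a per-feature vote equal to the class distribution among training cases sharing the query's value, and a hypothetical benefit table `benefit[predicted][actual]`. The function names and the toy data are likewise illustrative.

```python
from collections import Counter, defaultdict

def train_projections(X, y):
    """For each feature separately, count class labels per observed value."""
    n_features = len(X[0])
    projections = [defaultdict(Counter) for _ in range(n_features)]
    for row, label in zip(X, y):
        for f, value in enumerate(row):
            if value is not None:  # assumption: None marks a missing value
                projections[f][value][label] += 1
    return projections

def predict_max_benefit(projections, benefit, classes, query):
    """Sum per-feature class votes, then pick the prediction whose
    vote-weighted benefit is largest (assumed rule, not from the paper)."""
    votes = dict.fromkeys(classes, 0.0)
    for f, value in enumerate(query):
        counts = projections[f].get(value) if value is not None else None
        if not counts:
            continue  # unseen or missing value: this feature abstains
        total = sum(counts.values())
        for label, n in counts.items():
            votes[label] += n / total  # each feature distributes one vote

    def weighted_benefit(pred):
        return sum(votes[actual] * benefit[pred][actual] for actual in classes)

    return max(classes, key=weighted_benefit)

# Toy data: two categorical features, two classes (illustrative only).
classes = ["early", "late"]
X = [("a", "x"), ("a", "y"), ("b", "y"), ("b", "y")]
y = ["early", "late", "late", "late"]

# Hypothetical asymmetric benefits: catching an early case is worth the most,
# and missing one is penalized more than a false alarm.
asymmetric = {"early": {"early": 10, "late": -1},
              "late": {"early": -5, "late": 1}}
# Symmetric benefits reduce the rule to plain majority voting.
symmetric = {"early": {"early": 1, "late": 0},
             "late": {"early": 0, "late": 1}}

projections = train_projections(X, y)
query = ("a", "y")
print(predict_max_benefit(projections, symmetric, classes, query))
print(predict_max_benefit(projections, asymmetric, classes, query))
```

For the query `("a", "y")` the votes are 0.5 for `early` and 1.5 for `late`. Under the symmetric table the prediction follows the majority vote, whereas under the asymmetric table the high benefit assigned to a correct `early` prediction outweighs the vote count. This is the kind of behavior a benefit-maximizing classifier is meant to produce when early detection matters most.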
