Building a Medical Decision Support System for Colon Polyp Screening by Using Fuzzy Classification Trees

Highly uncertain and noisy data, such as biochemical laboratory examinations, call for a classifier that can assign an instance to every possible class, with each class associated with a degree that shows how possible it is that the instance belongs to that class. These degrees let us discriminate the more possible classes from the less possible ones, and the classifier or an expert can pick the most possible class as the instance's class. However, if the degrees are not clearly distinguishable, it is better for the classifier to make no prediction, especially when the data are incomplete or inadequate. A fuzzy classifier is proposed to classify data with noise and uncertainties. Instead of determining a single class for a given instance, fuzzy classification predicts the degree of possibility for every class.

Adenomatous polyps are widely accepted to be precancerous lesions that can ultimately degenerate into cancer. Therefore, it is important to develop a predictive method that can identify patients who have developed polyps so that their lesions can be removed. Considering the uncertainties and noise in biochemical laboratory examination data, fuzzy classification trees, which integrate decision tree techniques with fuzzy classification, provide an efficient way to classify the data and generate a model for polyp screening.
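The decision rule described in the abstract (report a degree for every class, pick the most possible one, and abstain when the degrees are too close to tell apart) can be illustrated with a minimal Python sketch. It is not the paper's method: the function name `pick_class`, the class labels, and the `min_margin` threshold are all hypothetical, and the degrees are assumed to come from some fuzzy classifier not shown here.

```python
from typing import Dict, Optional


def pick_class(degrees: Dict[str, float], min_margin: float = 0.2) -> Optional[str]:
    """Return the most possible class, or None when no class stands out.

    `degrees` maps each class label to its degree of possibility.
    `min_margin` is a hypothetical threshold: if the gap between the two
    highest degrees is smaller than this, the function abstains.
    """
    if not degrees:
        return None  # nothing to decide on

    # Rank classes from most possible to least possible.
    ranked = sorted(degrees.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]

    (best_label, best_degree), (_, second_degree) = ranked[0], ranked[1]

    # Abstain when the top two classes are not clearly distinguishable.
    if best_degree - second_degree < min_margin:
        return None
    return best_label


# Illustrative degrees only; these are not results from the paper.
clear_case = pick_class({"polyp": 0.8, "no_polyp": 0.3})
ambiguous_case = pick_class({"polyp": 0.55, "no_polyp": 0.5})
```

In this sketch the first call returns a class label because one degree clearly dominates, while the second returns `None`, leaving the decision to an expert or to further examination.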
