Mining Health Data for Breast Cancer Diagnosis Using Machine Learning

