A comparative study on thyroid disease detection using K-nearest neighbor and Naive Bayes classification techniques

Data mining is an important research area in the medical sciences because efficient methods are needed for analyzing and detecting diseases. Data mining applications are used in healthcare management, health information systems, patient care systems, and related areas. It also plays a major role in analyzing the survivability of a disease. Classification and clustering are popular data mining techniques for understanding the various parameters of a health data set. In this research work, several classification models, including K-nearest neighbor, support vector machine, and Naive Bayes, are used to classify thyroid disease based on parameters such as TSH, T4U, and goiter. The experimental study was conducted using the RapidMiner tool, and the results show that K-nearest neighbor achieves better accuracy than Naive Bayes in detecting thyroid disease.
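The kind of comparison described above can be illustrated with a minimal sketch. It is not the study's pipeline: the study used RapidMiner, whereas this is plain Python, and the data here are synthetic. The two features (loosely labeled TSH-like and T4U-like), the class names, the Gaussian form of Naive Bayes, the choice of k = 3, and the 70/30 split are all assumptions made for illustration only.

```python
import math
import random
from collections import Counter, defaultdict


def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: euclidean(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def fit_gaussian_nb(train_X, train_y):
    """Per class, store the prior and each feature's mean and variance."""
    by_class = defaultdict(list)
    for row, label in zip(train_X, train_y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                     for c, m in zip(cols, means)]
        model[label] = (len(rows) / len(train_X), means, variances)
    return model


def nb_predict(model, x):
    """Pick the class with the highest log posterior under feature independence."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


def accuracy(predict, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


def demo(seed=0):
    """Compare both classifiers on SYNTHETIC data (not the study's data set)."""
    rng = random.Random(seed)
    data = []
    for _ in range(100):
        # Hypothetical (TSH-like, T4U-like) values; all numbers are made up.
        data.append(((rng.gauss(2.0, 0.5), rng.gauss(1.0, 0.1)), "negative"))
        data.append(((rng.gauss(8.0, 1.5), rng.gauss(1.0, 0.1)), "positive"))
    rng.shuffle(data)
    split = int(0.7 * len(data))  # assumed 70/30 train/test split
    train_X, train_y = zip(*data[:split])
    test_X, test_y = zip(*data[split:])
    model = fit_gaussian_nb(train_X, train_y)
    knn_acc = accuracy(lambda x: knn_predict(train_X, train_y, x), test_X, test_y)
    nb_acc = accuracy(lambda x: nb_predict(model, x), test_X, test_y)
    return knn_acc, nb_acc


if __name__ == "__main__":
    knn_acc, nb_acc = demo()
    print(f"KNN accuracy: {knn_acc:.3f}, Naive Bayes accuracy: {nb_acc:.3f}")
```

Accuracies obtained on this toy data say nothing about the reported result; which classifier performs better depends on the actual data set, its preprocessing, and the chosen parameters.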
