Epidemiology of lung cancer and approaches for its prediction: a systematic review and analysis

Background: Owing to the use of tobacco and the consumption of alcohol and adulterated food, worldwide cancer incidence is increasing at an alarming rate. Since the last decade of the twentieth century, lung cancer has been the most common cancer type. This study aimed to determine the global status of lung cancer and to evaluate the use of computational methods in the early detection of lung cancer.

Methods: We used lung cancer data from the United Kingdom (UK), the United States (US), India, and Egypt. For statistical analysis, we used incidence, mortality, and survival rates to better understand the critical state of lung cancer.

Results: In the UK and the US, we found a significant decrease in lung cancer mortality over the period 1990–2014, whereas in India and Egypt no comparably promising decrease was observed. Additionally, we observed that, in the UK and the US, the survival rates of women with lung cancer were higher than those of men. We also observed that data mining and evolutionary algorithms were efficient in lung cancer detection.

Conclusions: Our findings provide a comprehensive understanding of the incidence, mortality, and survival rates of lung cancer in the UK, the US, India, and Egypt. The combined use of data mining and evolutionary algorithms can be efficient in lung cancer detection.
