Exploring the risk dietary factors for the colorectal cancer

This study aims to identify key biomarkers of colorectal cancer by examining the impact of dietary factors on the disease. We first preprocessed the experimental data with statistical methods. The Relief algorithm was then used to extract the key features from the dietary data set. Finally, a support vector machine (SVM) was used to classify the data set and compute the classification accuracy. The results showed that vegetables, seafood, eggs, and milk have a strong impact on colorectal cancer. We therefore conclude that combining the Relief algorithm with an SVM model can identify key biomarkers for colorectal cancer, although the interactions among these features require further research.
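
The abstract does not give implementation details, so the following is only a minimal sketch of the feature-weighting step, assuming the original two-class Relief formulation with numeric features: a feature gains weight when it differs at the nearest example of the other class (nearest miss) and loses weight when it differs at the nearest example of the same class (nearest hit). The function name `relief_weights`, the feature names, and the toy data are all hypothetical and are not taken from the study. The SVM classification step is not shown.

```python
def relief_weights(X, y):
    """Relief feature weights for a two-class data set with numeric features."""
    n, d = len(X), len(X[0])
    # Per-feature range, used to scale every difference into [0, 1].
    lo = [min(row[f] for row in X) for f in range(d)]
    hi = [max(row[f] for row in X) for f in range(d)]
    span = [(hi[f] - lo[f]) or 1.0 for f in range(d)]

    def diff(f, a, b):
        return abs(a[f] - b[f]) / span[f]

    def dist(a, b):
        return sum(diff(f, a, b) for f in range(d))

    w = [0.0] * d
    # Visit every instance once (the original algorithm samples instances at random).
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        near_hit = min(hits, key=lambda j: dist(X[i], X[j]))
        near_miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            w[f] += (diff(f, X[i], X[near_miss]) - diff(f, X[i], X[near_hit])) / n
    return w


# Made-up toy data: column 0 separates the classes, column 1 is noise.
feature_names = ["vegetable_intake", "noise"]  # hypothetical names
X = [[1.0, 5.0], [1.2, 1.0], [0.9, 3.0],
     [3.0, 4.8], [3.2, 1.2], [2.9, 3.1]]
y = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X, y)
ranked = sorted(zip(feature_names, weights), key=lambda p: p[1], reverse=True)
for name, weight in ranked:
    print(f"{name}: {weight:.3f}")
```

In a pipeline like the one described, the highest-weighted features would then be passed to an SVM classifier, and the classification accuracy would indicate how informative the selected features are.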
