Machine learning techniques in early screening for gastric and oesophageal cancer

A database of 2692 dyspeptic patients over the age of 40 was established, comprising 73 epidemiological and clinical variables. A tree-based machine learning algorithm (PREDICTOR) was applied to this database to find rules that would classify patients into two groups: those suffering from gastric or oesophageal cancer, and the remainder. The results were encouraging: cross-validated classification performance showed that, by classifying 61.3% of the patients as high risk, a sensitivity of 94.9% and a specificity of 39.8% could be achieved. It is planned to construct an expert system based on the rules produced by the machine learning algorithm, to provide preliminary screening for cancer in dyspeptic patients.
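
The three reported figures are linked by a simple identity: the fraction of patients classified as high risk equals prevalence × sensitivity + (1 − prevalence) × (1 − specificity). The sketch below is illustrative arithmetic only, not part of the original analysis. It solves that identity for the cancer prevalence implied by the reported percentages and derives the corresponding positive predictive value. The function and variable names are hypothetical. The result is sensitive to rounding in the reported percentages, so it should be read as a rough estimate.

```python
# Illustrative arithmetic only: back out the prevalence implied by the
# reported figures. Names are hypothetical, not from the original study.

def implied_prevalence(flagged_fraction, sensitivity, specificity):
    """Solve flagged = p*sens + (1 - p)*(1 - spec) for the prevalence p."""
    false_positive_rate = 1.0 - specificity
    return (flagged_fraction - false_positive_rate) / (sensitivity - false_positive_rate)


def positive_predictive_value(prevalence, sensitivity, flagged_fraction):
    """Fraction of high-risk patients who actually have cancer."""
    return prevalence * sensitivity / flagged_fraction


# Figures reported in the abstract.
N_PATIENTS = 2692
FLAGGED = 0.613   # fraction of patients classified as high risk
SENS = 0.949      # sensitivity
SPEC = 0.398      # specificity

prevalence = implied_prevalence(FLAGGED, SENS, SPEC)
ppv = positive_predictive_value(prevalence, SENS, FLAGGED)
approx_cancers = prevalence * N_PATIENTS  # rough count, subject to rounding

print(f"implied prevalence: {prevalence:.3f}")
print(f"approximate number of cancer cases: {approx_cancers:.0f}")
print(f"implied positive predictive value: {ppv:.3f}")
```

Under these assumptions the implied prevalence is only a few percent. A high-sensitivity, low-specificity rule of this kind would therefore mainly serve to rule out the roughly 39% of patients classified as low risk, not to confirm cancer in those classified as high risk.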