Big data approach in the field of gastric and colorectal cancer research

Big data is characterized by three attributes: volume, variety, and velocity. In the healthcare setting, big data refers to vast datasets that are electronically stored and managed in an automated manner and that have the potential to enhance human health and the healthcare system. In this review, gastric cancer (GC) and postcolonoscopy colorectal cancer (PCCRC) are used to illustrate the application of the big data approach in gastrointestinal cancer research. Helicobacter pylori (HP) eradication reduces GC risk by only 46% because of preexisting precancerous lesions. Apart from endoscopic surveillance, identifying medications that modify GC risk is another strategy. Population-based cohort studies showed that long-term use of proton pump inhibitors (PPIs) was associated with a higher GC risk after HP eradication, whereas aspirin and statins were associated with a lower risk. While diabetes mellitus conferred a 73% higher GC risk, metformin use was associated with a 51% lower risk, an effect that was independent of glycemic control. Nonetheless, nonaspirin nonsteroidal anti-inflammatory drugs (NA-NSAIDs) were not associated with a lower GC risk. Colorectal cancer (CRC) can still occur after an initial colonoscopy in which no cancer was detected (i.e., PCCRC). Between 2005 and 2013, the rate of interval-type PCCRC-3y (defined as CRC diagnosed between 6 and 36 months after an index colonoscopy that was negative for CRC) was 7.9% in Hong Kong, with >80% being distal cancers and with higher cancer-specific mortality than detected CRC. Certain clinical and endoscopy-related factors were associated with PCCRC-3y risk. Medications shown to have chemopreventive effects on PCCRC include statins, NA-NSAIDs, and angiotensin-converting enzyme inhibitors/angiotensin receptor blockers.