Discovering and Validating Breast Cancer Treatment Correlations Using an Associative Memory Model and Statistical Methods

Breast cancer care involves many clinical considerations, including patient characteristics such as age, hormone-receptor status, and cancer stage, and the choice among several interventions, such as surgery, radiation therapy, and drug administration. Discovering the relationships between these characteristics and interventions in real-world care is challenging because the relevant data are fragmented across multiple information systems and no common analysis framework correlates all the factors across those systems. In this paper, we present an associative memory model that integrates heterogeneous data from electronic medical records and a tumor registry. We then run efficient queries on the aggregated data model to obtain supporting evidence in the form of patient counts for all pairwise combinations of characteristic and intervention factors. Finally, multiple statistical hypothesis testing protocols are applied to the supporting evidence to discover the important correlations among the factor categories. The results show significant correlations of patient age group and hormone-receptor status with both administered procedures and drugs.
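As a rough illustration of the counting and testing steps described above, the following minimal Python sketch tabulates patient counts for one characteristic/intervention pair and tests the pair for association. It is not the associative memory model itself, and the abstract does not name the specific tests used: the Pearson chi-square test, the Bonferroni correction, the function names (`pair_counts`, `chi_square_2x2`, `bonferroni`), and the toy records are all assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import product


def pair_counts(records, factor_a, factor_b):
    """Count patients for every (value_a, value_b) combination of two factors."""
    return Counter((r[factor_a], r[factor_b]) for r in records)


def chi_square_2x2(counts, levels_a, levels_b):
    """Pearson chi-square statistic and p-value for a 2x2 table (df = 1).

    Assumes every row and column total is nonzero.
    """
    cells = list(product(levels_a, levels_b))
    n = sum(counts[cell] for cell in cells)
    row = {a: sum(counts[(a, b)] for b in levels_b) for a in levels_a}
    col = {b: sum(counts[(a, b)] for a in levels_a) for b in levels_b}
    stat = 0.0
    for a, b in cells:
        expected = row[a] * col[b] / n  # count expected under independence
        stat += (counts[(a, b)] - expected) ** 2 / expected
    # With one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def bonferroni(p_values):
    """Adjust p-values for the number of factor pairs tested."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up toy records, NOT data from the study.
records = (
    [{"age_group": "<50", "drug": "A"}] * 30
    + [{"age_group": "<50", "drug": "B"}] * 10
    + [{"age_group": ">=50", "drug": "A"}] * 10
    + [{"age_group": ">=50", "drug": "B"}] * 30
)

counts = pair_counts(records, "age_group", "drug")
stat, p = chi_square_2x2(counts, ["<50", ">=50"], ["A", "B"])
adjusted = bonferroni([p])  # in practice, one p-value per factor pair
print(stat, p, adjusted)
```

In a real analysis each characteristic/intervention pair would contribute one p-value, and the correction would be applied across all of them; tables larger than 2x2 would need the general chi-square distribution rather than the one-degree-of-freedom shortcut used here.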