A Neural Network Model for Prognostic Prediction

This thesis explores the potential uses of artificial neural networks for several different types of problems in some of the key areas of modern epidemiological research. Four of the five applications represent the first attempts to use artificial neural networks in this way.

Recent weather as a predictor of gastroenteritis in Melbourne

Neural network models were trained on different types and combinations of inputs related to social events, recent requests for faecal analysis, recent weather, and day of the year. All inputs were smoothed as 7-day asymmetric moving averages. The outputs were similarly smoothed leads of the faecal analysis requests series, and the aim was to predict analysis request numbers up to seven days ahead. Each new input added to the models' ability to generalise the relationships, as shown by improving R² values for the validation data despite 'early stopping' of network training. Adding the day of year input, representing overall seasonal effects, was only as effective as adding one of the recent weather inputs. Models including all the weather inputs accounted for most of the variation in faecal request numbers up to seven days ahead. Recent weather is one of the most important predictors of requests for faecal analysis in metropolitan Melbourne up to at least seven days into the future.

Forecasting Gastroenteritis in Melbourne

Even cities with state-of-the-art water treatment facilities may still suffer large outbreaks of water-borne gastroenteritis. This study examined the use of artificial neural networks to model time series of requests for faecal analysis in the city of Melbourne, Australia. Retrospective daily counts of requests for faecal microscopy and culture were obtained from Australia's government health insurance scheme. Weather data and public and school holidays were included to create eight multivariate time series with data from July 1, 1996 to June 30, 2000.
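The smoothing and lead construction used for the gastroenteritis series can be illustrated with a minimal sketch. It assumes that 'asymmetric' means a trailing window (the current day and the six days before it), which the abstract does not state explicitly. The function names and the daily counts are hypothetical.

```python
from typing import List, Optional, Tuple


def trailing_mean(series: List[float], window: int = 7) -> List[Optional[float]]:
    """Trailing moving average; None until a full window is available.

    Assumption: 'asymmetric' is read as a window covering the current
    day and the (window - 1) preceding days.
    """
    out: List[Optional[float]] = []
    running = 0.0
    for i, value in enumerate(series):
        running += value
        if i >= window:
            running -= series[i - window]  # drop the day that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def make_lead_pairs(
    smoothed: List[Optional[float]], lead: int
) -> List[Tuple[float, float]]:
    """Pair the smoothed value on day t with the smoothed value on day t + lead."""
    pairs: List[Tuple[float, float]] = []
    for t in range(len(smoothed) - lead):
        x, y = smoothed[t], smoothed[t + lead]
        if x is not None and y is not None:
            pairs.append((x, y))
    return pairs


# Made-up daily request counts for 14 days (illustration only).
daily_requests = [float(n) for n in range(1, 15)]
smoothed = trailing_mean(daily_requests)
pairs = make_lead_pairs(smoothed, lead=7)
print(pairs)
```

Under this trailing-window assumption, the target for a lead shorter than seven days shares some days with the input window, so only the seven-day lead uses entirely unseen days.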
The series were smoothed using asymmetric seven-day moving averages, and used to train feedforward neural networks by backpropagation of errors. All networks were trained to predict smoothed request numbers one to seven days ahead. Model generalisation and forecasting ability were tested separately. In general, the more inputs included, the better the model fit on the training and validation data sets. However, the larger the models, the less well they predicted request numbers in the prospective 180-day test set. Smaller models using limited weather inputs gave quite accurate predictions on the prospective test sets. At this city-wide scale artificial neural networks can generalise the relationships between past events and future daily numbers of requests for faecal analysis. The best of the models produced in this study would be very useful in the early detection of water-borne disease outbreaks.

Measles in Mozambique

Measles is a common childhood illness in Mozambique. This study assessed the potential of artificial neural networks in forecasting weekly cases, which would allow more timely and effective control measures. Models were trained using a ten-year data set of measles reports from the surveillance system. Forecasting ability was tested for two kinds of hold-out test sets: a 15% set from the same time window, and a 15% set from beyond the end of the training window. The models fit the smoothed training data well. Good generalisation was achieved for the same time window on which the models were trained (R > 0.9 up to 8 weeks ahead), but true prospective forecasting was poor. At an appropriate geographic scale, and with suitable pre-processing, neural networks can accurately relate future measles reports to past reports within the same time window. However, these relationships do not necessarily hold into the future within the same time series. The models created would not be useful in practical applications.
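The two kinds of hold-out test set used for the measles series can be sketched as follows. This is an illustration under stated assumptions, not the thesis procedure: the abstract does not say how the in-window set was drawn or what the 15% fractions are relative to, so random sampling, the function names, and the 520-week series length are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def prospective_split(n: int, test_fraction: float = 0.15) -> Tuple[List[int], List[int]]:
    """Reserve the final share of the series as a prospective test set."""
    n_test = round(n * test_fraction)
    cut = n - n_test
    return list(range(cut)), list(range(cut, n))


def in_window_split(
    indices: Sequence[int], test_fraction: float = 0.15, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Draw a random hold-out set from inside the training time window."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return sorted(shuffled[n_test:]), sorted(shuffled[:n_test])


n_weeks = 520  # hypothetical: ten years of weekly reports
window, prospective_test = prospective_split(n_weeks)
train, in_window_test = in_window_split(window)
print(len(train), len(in_window_test), len(prospective_test))
```

Every prospective test week lies after every training week, whereas the in-window test weeks are interleaved with the training weeks. That difference is what separates the two results reported above.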
Extracting prognostic information from cancer registry and health care utilisation data

How much prognostic information for survival does a particular cancer registry contain, and what is the best way to extract that information? This study assessed the potential of prognostic models based on a localised database including cancer registry and health care utilisation data (rather than more universal but less accurate 'bin models' such as the Tumour-Node-Metastases (TNM) system). Neural networks were used to test for non-linear relationships, to determine whether the more transparent Cox proportional hazards regression technique extracts all available information from the data set. The study used data from the U.S. National Cancer Institute's Surveillance, Epidemiology, and End Results-Medicare (SEER-Medicare) data set on carcinoma of the colon, including all individuals aged 65 or more diagnosed with node-positive colon cancer between 1992 and 1996, in an area encompassing approximately 14% of the U.S. population. All 4463 cases had potentially curative resection of the tumour and were followed either to death or for two years; of these, 2615 patients were followed to five years. Separate models were made to predict survival to two and five years. Model inputs were parameters available shortly after diagnosis, including use of adjuvant chemotherapy with 5-fluorouracil (5FU)-containing regimens.

Cox models were estimated using maximum likelihood in SAS version 8.2. Model parameters estimated with the training sets were used in the test sets to provide a survival curve for each individual, from which the predicted probabilities of surviving were extracted. Feed-forward neural networks were trained by back-propagation of errors using NeuroShell2, with one hidden layer and a single output (the network's estimate of survival probability). The two techniques gave similar results.
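For reference, the way a Cox model yields an individual survival curve can be written in its standard textbook form. The notation is generic and not taken from the thesis: $\mathbf{x}_i$ denotes the inputs for individual $i$, $\hat{\boldsymbol{\beta}}$ the coefficients estimated on the training set, $h_0$ and $\hat{S}_0$ the baseline hazard and estimated baseline survival function, and $t^{*}$ the horizon of two or five years. How the baseline was estimated is not stated in the abstract.

```latex
h(t \mid \mathbf{x}_i) = h_0(t)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}_i\right),
\qquad
\hat{S}(t^{*} \mid \mathbf{x}_i) = \hat{S}_0(t^{*})^{\exp\left(\hat{\boldsymbol{\beta}}^{\top}\mathbf{x}_i\right)}
```

The quantity $\hat{S}(t^{*} \mid \mathbf{x}_i)$ is a probability of surviving to $t^{*}$, which makes it directly comparable with the single output of the neural network.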
For survival at two and five years respectively, the mean percentage correct was 76.2% and 69.3% for the neural networks and 76.8% and 70.1% for the Cox models. Areas under receiver operating characteristic (ROC) curves gave similar results for the two methods, with areas around 73% and overlapping confidence intervals. Both the neural network and Cox models were well calibrated. There was close agreement between the different types of models (whether correct or not) for individual patients. The SEER-Medicare database contains considerable prognostic information for survival outcomes. The similar results for the Cox models and neural networks suggest that there are no important non-linearities in these data, and that the Cox models capture as much prognostic information as exists. The model predictions are well calibrated, so they are of potential use to health facilities in comparing their outcomes. However, incorporation of more clinical and biochemical details in the registry might remove a large proportion of the uncertainty from the prognostic estimates.

Artificial neural networks and Job Specific Modules to assess occupational exposure

Job Specific Modules (JSMs) were used to collect information for expert retrospective exposure assessment in a community-based non-Hodgkin's lymphoma study in New South Wales, Australia. Using exposure assessment by a hygienist, artificial neural networks were developed to predict overall and intermittent benzene exposure from the Driver module. Even with a small data set (189 drivers), neural networks could assess benzene exposure with an average of 90% accuracy. By appropriate choice of cutoff (decision threshold), the neural networks could reliably reduce the expert's workload by around 60% by identifying negative JSMs. The use of artificial neural networks shows promise in future applications to occupational assessment by job specific modules and expert assessment.
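The idea of choosing a cutoff so that negative JSMs can be set aside without expert review can be sketched as below. This is one plausible reading, not the thesis method: the rule used to pick the cutoff is not given in the abstract, and the function name, the `max_missed` parameter, and the scores are hypothetical.

```python
from typing import List, Tuple


def screening_cutoff(
    scores: List[float], exposed: List[bool], max_missed: int = 0
) -> Tuple[float, float]:
    """Pick a cutoff below which JSMs are treated as negative.

    Returns (cutoff, workload_reduction). At most `max_missed` cases that
    the expert rated as exposed may fall below the cutoff.
    """
    positives = sorted(s for s, e in zip(scores, exposed) if e)
    if not positives:
        raise ValueError("need at least one expert-rated exposed case")
    if max_missed >= len(positives):
        raise ValueError("max_missed must be smaller than the number of exposed cases")
    cutoff = positives[max_missed]
    # JSMs scoring strictly below the cutoff skip expert review.
    screened_out = sum(1 for s in scores if s < cutoff)
    return cutoff, screened_out / len(scores)


# Made-up network outputs and expert ratings for ten drivers (illustration only).
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
exposed = [False, False, False, False, False, True, False, False, True, True]
cutoff, reduction = screening_cutoff(scores, exposed)
print(cutoff, reduction)
```

A cutoff chosen this way is fitted to the labelled cases, so its miss rate would need to be confirmed on JSMs that were not used to choose it.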
