Probabilistic models in the biomedical sciences.

Probabilistic, and in particular Bayesian, methods for modelling data are becoming increasingly sophisticated. This development has been fuelled by the demand to analyse the enormous wealth of data being produced by the biomedical sciences. In this thesis we present a variety of unsupervised generative probabilistic models loosely based around mixtures of distributions. The motivation for using these models is that a mixture reflects aspects of a biomedical process that has a number of contributing factors. We analyse gene expression data from microarrays, sequence motif data and radiological data. We attempt to model the interactions between motif data and gene expression for yeast, and we perform in-depth analysis of gene expression data for four breast cancer datasets. The radiological data comes from computed tomography scans and radiologist reports. We model the interaction between image data from scans and textual data from reports for a number of lung diseases. A common theme throughout this thesis is data fusion: this can be the joint modelling of two separate datasets, the comparison of equivalent datasets from independent sources, or simply the incorporation of external information into the model.