Non-exhaustive Learning for Bacteria Detection

Technologies for rapid detection and classification of bacterial pathogens are crucial for securing the food supply. A light-scattering sensor recently developed for real-time detection and identification of colonies of multiple pathogens has shown great promise for distinguishing bacterial cultures at the genus and species level for Listeria, Staphylococcus, Salmonella, Vibrio, and Escherichia. Unlike traditional testing methods, this technology does not require a labeling reagent or biochemical processing. The classification approach currently used with this technology relies on supervised learning. For accurate detection and classification of bacterial pathogens, the library used to train the classifier should contain samples of all possible forms of the pathogens. Constructing such a training library is impractical, if not impossible, because of the high mutation rate that characterizes some of the infectious agents. In this study we propose a Bayesian approach that advances this sensor technology by allowing detection of new classes or subclasses of bacteria that do not exist in the training library. Learning with a non-exhaustive training library is an ill-defined problem. We assume Gaussian distributions for bacterial subclasses and implement a maximum likelihood classifier. A pair of conjugate priors based on the Wishart distribution is defined, and the covariance matrices are estimated by the posterior mean. A new sample is classified into one of the existing classes if the maximum of the likelihoods is above a designated threshold. If not, the sample is considered a novelty, i.e., a sample of a potentially new class. We compare the proposed approach with a benchmark support estimation technique as well as a recently proposed simulated Bayesian modelling approach.
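
The decision rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an inverse-Wishart prior on each class covariance (scale matrix `psi`, degrees of freedom `nu`), fixes each class mean at the sample mean as a simplification, and uses an arbitrary log-likelihood threshold. All function names, parameter values, and the synthetic data are hypothetical.

```python
import numpy as np

def posterior_mean_cov(X, psi, nu):
    """Posterior-mean covariance under an assumed inverse-Wishart(psi, nu) prior.

    Simplification: the class mean is fixed at the sample mean.
    Requires nu + n > d + 1 so that the posterior mean exists.
    """
    n, d = X.shape
    centered = X - X.mean(axis=0)
    scatter = centered.T @ centered
    return (psi + scatter) / (nu + n - d - 1)

def gaussian_loglik(x, mean, cov):
    """Log-density of a multivariate Gaussian evaluated at x."""
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def classify_or_flag(x, class_params, threshold):
    """Return the maximum-likelihood class label, or None for a novelty."""
    scores = {label: gaussian_loglik(x, mean, cov)
              for label, (mean, cov) in class_params.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

# Synthetic two-class "training library" (illustrative data only).
rng = np.random.default_rng(0)
d = 2
library = {
    "A": rng.normal(loc=0.0, scale=1.0, size=(50, d)),
    "B": rng.normal(loc=5.0, scale=1.0, size=(50, d)),
}
psi, nu = np.eye(d), d + 2  # hypothetical prior hyperparameters
class_params = {
    label: (X.mean(axis=0), posterior_mean_cov(X, psi, nu))
    for label, X in library.items()
}

threshold = -10.0  # hypothetical log-likelihood cutoff
known = classify_or_flag(np.array([0.0, 0.0]), class_params, threshold)
novel = classify_or_flag(np.array([50.0, 50.0]), class_params, threshold)
```

In this sketch a sample near a known class centre is assigned to that class, while a sample far from every class falls below the threshold and is returned as `None`, marking it as a candidate for a new class. The prior keeps the covariance estimate well conditioned when a class has few training samples, which is one reason to prefer the posterior mean over the raw sample covariance.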