Cortically coupled image computing

In the 1970s, researchers at the University of California started to investigate communication between humans and computers using neural signals, which led to the emergence of brain-computer interfaces (BCIs). In the past 40 years, significant progress has been achieved in application areas such as neuroprosthetics and rehabilitation. More recently, BCIs have been applied to media analytics (e.g., image search and information retrieval), as we are now surrounded by tremendous amounts of media information. A cortically coupled computer vision (CCCV) system is a type of BCI that exposes users to high-throughput image streams via the rapid serial visual presentation (RSVP) protocol. Media analytics has also been transformed by the enormous recent advances in artificial intelligence (AI). Understanding and presenting the nature of the human-AI relationship will play an important role in our society in the future. This thesis explores two lines of research in the context of traditional BCIs and AI. Firstly, we study fundamental processing methods, such as feature extraction and classification, for CCCV systems. Secondly, we discuss the feasibility of interfacing neural systems with AI technology through CCCV, an area we identify as neuro-AI interfacing. We have made two electroencephalography (EEG) datasets available to the community that support our investigation of these two research directions: the neurally augmented image labelling strategies (NAILS) dataset and the neural indices for face perception analysis (NIFPA) dataset, both introduced in Chapter 2.

The first line of research focuses on fundamental processing methods for CCCV. In Chapter 3, we present a review of recent developments in processing methods for CCCV.
This review introduces CCCV-related components, specifically the RSVP experimental setup, RSVP-EEG phenomena such as the P300 and N170, evaluation metrics, feature extraction and classification. In Chapter 4, we then provide a detailed study and analysis of spatial filtering pipelines, which are the most widely used feature extraction and reduction methods in CCCV systems. In this context, we propose a spatial filtering technique named multiple time window LDA beamformers (MTWLB) and compare it to two other well-known techniques in the literature, namely xDAWN and common spatial patterns (CSP). Importantly, we demonstrate the efficacy of MTWLB for time-course source signal reconstruction compared to existing methods, and we then use it as a source signal information extraction method to support a neuro-AI interface, as discussed further in Chapter 6 and Chapter 7.

The latter part of this thesis investigates the feasibility of neuro-AI interfaces. We present two research studies that contribute to this direction. Firstly, we explore the idea of neuro-AI interfaces based on stimulus and neural systems, i.e., observing the effects of stimuli produced by different AI systems on neural signals. We use generative adversarial networks (GANs) to produce image stimuli in this case, as GANs are able to produce higher quality images than other deep generative models. Chapter 5 provides a review of GAN variants in terms of loss functions and architectures. In Chapter 6, we design a comprehensive experiment to verify the effects of images produced by different GANs on participants’ EEG responses. Here we propose a biologically produced metric called Neuroscore for evaluating GAN performance.
We highlight the consistency between Neuroscore and human perceptual judgment, which is superior to that of the conventional metrics discussed in this thesis (i.e., Inception Score (IS), Fréchet Inception Distance (FID) and Kernel Maximum Mean Discrepancy (MMD)). Secondly, in order to generalize Neuroscore, we explore the use of a neuro-AI interface to help convolutional neural networks (CNNs) predict a Neuroscore with only an image as the input. In this scenario, we feed the reconstructed P300 source signals to the intermediate layer as supervisory information. We demonstrate that including biological neural information can improve the prediction performance of our proposed CNN models, and that the predicted Neuroscore is highly correlated with the real Neuroscore (as directly calculated from human neural signals).
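
To make the spatial filtering idea mentioned above more concrete, the following is a minimal sketch of a generic single-window LDA beamformer applied over several time windows. It is not the MTWLB implementation from this thesis: the function names, the shrinkage regularization, the array layout (trials x channels x samples) and the synthetic data are all assumptions made for illustration. The filter is computed as w = C^{-1} p / (p^T C^{-1} p), where p is the target-minus-non-target mean pattern in a time window and C is the channel covariance, so that the filter passes the pattern p with unit gain.

```python
import numpy as np


def lda_beamformer(target, nontarget, window, shrink=0.05):
    """Hypothetical helper: one LDA beamformer filter for one time window.

    target, nontarget: arrays of shape (n_trials, n_channels, n_samples).
    window: (start, stop) sample indices defining the time window.
    shrink: assumed shrinkage strength towards a scaled identity matrix.
    """
    start, stop = window
    # Spatial pattern: class-mean difference, averaged over trials and window.
    p = (target[:, :, start:stop].mean(axis=(0, 2))
         - nontarget[:, :, start:stop].mean(axis=(0, 2)))
    # Channel covariance estimated from all epochs of both classes.
    data = np.concatenate([target, nontarget], axis=0)
    x = data.transpose(1, 0, 2).reshape(data.shape[1], -1)
    cov = np.cov(x)
    n_ch = cov.shape[0]
    # Shrinkage keeps the covariance well conditioned (an assumption here).
    cov = (1 - shrink) * cov + shrink * (np.trace(cov) / n_ch) * np.eye(n_ch)
    cov_inv_p = np.linalg.solve(cov, p)
    w = cov_inv_p / (p @ cov_inv_p)  # unit-gain constraint: w @ p == 1
    return w, p


def multi_window_filters(target, nontarget, windows, shrink=0.05):
    """Hypothetical helper: one filter per time window."""
    return [lda_beamformer(target, nontarget, win, shrink)[0] for win in windows]


def reconstruct_source(w, epochs):
    """Project epochs (trials x channels x samples) onto one filter."""
    return np.tensordot(w, epochs, axes=(0, 1))  # shape: (trials, samples)


# Synthetic example: a fixed spatial pattern added to target epochs only.
rng = np.random.default_rng(0)
pattern = rng.normal(size=8)
nontarget_epochs = rng.normal(size=(40, 8, 50))
target_epochs = rng.normal(size=(40, 8, 50))
target_epochs[:, :, 20:30] += pattern[:, None]

filters = multi_window_filters(target_epochs, nontarget_epochs,
                               windows=[(10, 20), (20, 30), (30, 40)])
source = reconstruct_source(filters[1], target_epochs)
```

In this sketch each window yields its own filter, and `source` is a single-trial time course for the window that contains the simulated response. How the windows are chosen and how the per-window outputs are combined are left open, since the abstract does not specify them.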