Combining brain computer interfaces with vision for object categorization

Human-aided computing proposes using information measured directly from the human brain to perform useful tasks. In this paper, we extend this idea by fusing computer vision-based processing with processing done by the human brain to build more effective object categorization systems. Specifically, we use an electroencephalograph (EEG) device to measure the subconscious cognitive processing that occurs in the brain as users view images, even when they are not explicitly trying to classify them. We present a novel framework that combines a discriminative visual category recognition system based on the pyramid match kernel (PMK) with information derived from EEG measurements recorded while users view images. We propose a fast convex kernel alignment algorithm to combine the two sources of information effectively. We validate the approach in experiments on real-world data, where it yields significant gains in classification accuracy. We analyze the properties of this information fusion method by examining the relative contributions of the two modalities, the errors arising from each source, and the stability of the combination across repeated experiments.
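To make the idea of combining two kernels by alignment more concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a convex combination K = w * K_vision + (1 - w) * K_eeg and picks w by a plain grid search that maximizes the standard kernel-target alignment score against the label matrix y y^T. The function names, the grid search, and the toy kernel values are all illustrative assumptions; the abstract does not specify the optimization used.

```python
import math


def frobenius_inner(a, b):
    """Frobenius inner product of two equally sized square matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def alignment(kernel, labels):
    """Kernel-target alignment between a kernel matrix and labels in {-1, +1}."""
    target = [[yi * yj for yj in labels] for yi in labels]
    num = frobenius_inner(kernel, target)
    den = math.sqrt(frobenius_inner(kernel, kernel) * frobenius_inner(target, target))
    return num / den if den > 0 else 0.0


def combine(k_vision, k_eeg, weight):
    """Convex combination of two kernel matrices; weight must lie in [0, 1]."""
    return [
        [weight * v + (1.0 - weight) * e for v, e in zip(row_v, row_e)]
        for row_v, row_e in zip(k_vision, k_eeg)
    ]


def best_weight(k_vision, k_eeg, labels, steps=20):
    """Grid search (an assumption, not the paper's solver) for the weight
    whose combined kernel has the highest alignment with the labels."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda w: alignment(combine(k_vision, k_eeg, w), labels))


# Toy, made-up kernel matrices for four images (two per class).
labels = [1, 1, -1, -1]
k_vision = [
    [1.0, 0.8, 0.3, 0.1],
    [0.8, 1.0, 0.2, 0.4],
    [0.3, 0.2, 1.0, 0.5],
    [0.1, 0.4, 0.5, 1.0],
]
k_eeg = [
    [1.0, 0.2, 0.1, 0.3],
    [0.2, 1.0, 0.4, 0.1],
    [0.1, 0.4, 1.0, 0.9],
    [0.3, 0.1, 0.9, 1.0],
]

w = best_weight(k_vision, k_eeg, labels)
print("chosen vision weight:", w)
print("alignment of combined kernel:", alignment(combine(k_vision, k_eeg, w), labels))
```

Because the grid includes w = 0 and w = 1, the combined kernel in this sketch is never less aligned with the labels than either single-modality kernel. The combined matrix could then be passed to any classifier that accepts a precomputed kernel.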
