Beyond Novelty Detection: Incongruent Events, When General and Specific Classifiers Disagree

Unexpected stimuli are a challenge to any machine learning algorithm. Here, we identify distinct types of unexpected events that arise when general-level and specific-level classifiers give conflicting predictions. We define a formal framework for the representation and processing of incongruent events: starting from the notion of a label hierarchy, we show how a partial order on labels can be deduced from such hierarchies. For each event, we compute its probability in different ways, based on adjacent levels in the label hierarchy. An incongruent event is one whose probability computed from some more specific level is much smaller than its probability computed from some more general level, leading to conflicting predictions. We derive algorithms to detect incongruent events for different types of hierarchies, different applications, and a variety of data types. We present promising results for the detection of novel visual and audio objects, and of new patterns of motion in video. We also discuss the detection of Out-Of-Vocabulary words in speech recognition, and the detection of incongruent events in a multimodal audiovisual scenario.
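
The definition of an incongruent event can be illustrated with a minimal sketch. It is not the paper's algorithm: the function name `detect_incongruence`, the use of the maximum over child classifiers as the specific-level probability, and the `accept` and `ratio` thresholds are all assumptions made for illustration, since the abstract only states that the specific-level probability must be "much smaller" than the general-level one.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class IncongruenceResult:
    p_general: float    # probability from the more general level
    p_specific: float   # probability from the more specific level
    incongruent: bool   # True when the two levels conflict


def detect_incongruence(
    p_general: float,
    p_children: Sequence[float],
    accept: float = 0.5,   # hypothetical: general level "accepts" above this
    ratio: float = 0.1,    # hypothetical: what counts as "much smaller"
) -> IncongruenceResult:
    """Flag an event that the general classifier accepts while the
    specific level assigns it a far smaller probability."""
    if not p_children:
        raise ValueError("need at least one specific-level probability")
    # Assumption: summarize the specific level by its best-matching child.
    p_specific = max(p_children)
    incongruent = p_general >= accept and p_specific <= ratio * p_general
    return IncongruenceResult(p_general, p_specific, incongruent)


# Example: the general classifier is confident, but no specific class fits.
result = detect_incongruence(p_general=0.9, p_children=[0.02, 0.05])
print(result.incongruent)
```

Under these assumptions, an event rejected by both levels is not flagged: it is simply unrecognized rather than incongruent, which is the distinction the framework draws from plain novelty detection.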
