Abstract

In this paper we describe a classifier combination technique used in a human identification system. The system integrates all available features from multi-modal sources within a Bayesian framework. The framework allows a class of popular classifier combination rules and methods to be represented within a single formalism. It relies on a “per-class” measure of confidence, derived from the performance of each classifier on training data, which is shown to improve performance on a synthetic data set. The method is especially relevant in autonomous surveillance settings, where varying time scales and missing features are common. We show an application of this technique to a real-world surveillance database of video and audio recordings of people, collected over several weeks in an office setting.

1 Introduction and Motivation

In problems of biometric verification and identification, the multi-modal aspect of the observation plays a large role. A person can be identified by a number of features, including face, height, body shape, gait, and voice. However, the features are not equal in their overall contribution to identifying a person. For instance, modern algorithms for face classification (e.g. [11]) and speaker identification (e.g. [6]) can attain high recognition rates, provided that the data is well formed and relatively free of variation and noise, while other features, such as gait (e.g. [1]) or body shape, are only mildly discriminative.

Even though high recognition rates can be achieved when classifying some of these features, in reality they are observed only rarely. In a surveillance video sequence, the face image can be used only if the person is close enough and is facing the camera, and the voice can be used only while the person is speaking. In contrast, there is a plentiful supply of the less discriminative features. This situation is illustrated by an example from one of our video sequences in figure 1.
Larry S. Davis, et al. Stride and cadence as a biometric in automatic person identification and verification. Proceedings of Fifth IEEE International Conference on Automatic Face and Gesture Recognition.
Jeho Nam, et al. Speaker identification and video analysis for hierarchical video shot classification. Proceedings of International Conference on Image Processing.
Jiri Matas, et al. Combining evidence in personal identity verification systems. Pattern Recognit. Lett.
Arun Ross, et al. Information fusion in biometrics. Pattern Recognit. Lett.
Azriel Rosenfeld, et al. Face recognition: A literature survey.
Josef Kittler, et al. Combining multiple classifiers by averaging or by multiplying? Pattern Recognit.
Jiri Matas, et al. Combining Evidence in Multimodal Personal Identity Recognition Systems.
Thomas Serre, et al. Categorization by Learning and Combining Object Parts.
Robert P. W. Duin, et al. A Discussion on the Classifier Projection Space for Classifier Combining. Multiple Classifier Systems.
Jeff A. Bilmes, et al. Directed graphical models of classifier combination: application to phone recognition.