IDENTITY VERIFICATION

This report first provides a review of important concepts in the field of information fusion, followed by a review of important milestones in audio-visual person identification and verification. Several recent adaptive and non-adaptive techniques for reaching the verification decision (i.e., whether to accept or reject the claimant), based on speech and face information, are then evaluated in clean and noisy audio conditions on a common database. It is shown that in clean conditions most of the non-adaptive approaches provide similar performance, while in noisy conditions most exhibit a severe deterioration in performance. It is also shown that current adaptive approaches are either inadequate or rely on restrictive assumptions. A new category of classifiers is then introduced, where the decision boundary is fixed but constructed to take into account how the distributions of opinions are likely to change due to noisy conditions. Compared to a previously proposed adaptive approach, the proposed classifiers do not make a direct assumption about the type of noise that causes the mismatch between training and testing conditions. This report is an extended and revised version of [59].

[1]  John A. Nelder,et al.  A Simplex Method for Function Minimization , 1965, Comput. J..

[2]  S. Furui,et al.  Cepstral analysis technique for automatic speaker verification , 1981 .

[3]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[4]  Robert R. Tenney,et al.  Detection with distributed sensors , 1980 .

[5]  Nils R. Sandell,et al.  Strategies for Distributed Decisionmaking , 1981, IEEE Transactions on Systems, Man, and Cybernetics.

[6]  David Casasent,et al.  Multisensor Image Registration: Experimental Verification , 1981, Optics & Photonics.

[7]  L. F. Pau,et al.  FUSION OF MULTISENSOR DATA IN PATTERN RECOGNITION , 1982 .

[8]  D. F. Burrows,et al.  Hardware and architecture design of VLSI systems , 1985 .

[9]  Aaron E. Rosenberg,et al.  On the use of instantaneous and transitional spectral information in speaker recognition , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[10]  F. A. Seiler,et al.  Numerical Recipes in C: The Art of Scientific Computing , 1989 .

[11]  J F Osborn,et al.  Significance tests , 1989, British Dental Journal.

[12]  James Llinas,et al.  Multisensor Data Fusion , 1990 .

[13]  Sara H. Basson,et al.  NTIMIT: a phonetically balanced, continuous speech, telephone bandwidth speech database , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[14]  M. Turk,et al.  Eigenfaces for Recognition , 1991, Journal of Cognitive Neuroscience.

[15]  Douglas A. Reynolds,et al.  A Gaussian mixture modeling approach to text-independent speaker identification , 1992 .

[16]  John S. D. Mason,et al.  A voice activity detector based on cepstral analysis , 1993, EUROSPEECH.

[17]  Joseph Picone,et al.  Signal modeling techniques in speech recognition , 1993, Proc. IEEE.

[18]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[19]  Sargur N. Srihari,et al.  Decision Combination in Multiple Classifier Systems , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[20]  Douglas A. Reynolds,et al.  Experimental evaluation of features for robust speaker identification , 1994, IEEE Trans. Speech Audio Process..

[21]  Chin-Hui Lee,et al.  Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process..

[22]  Ren C. Luo,et al.  Multisensor integration and fusion for intelligent machines and systems , 1995 .

[23]  Roberto Brunelli,et al.  Person identification using multiple cues , 1995, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Ali Adjoudani,et al.  Audio-visual speech recognition compared across two architectures , 1995, EUROSPEECH.

[25]  Tomaso Poggio,et al.  Automatic person recognition by acoustic and geometric features , 1995 .

[26]  Alan C. Bovik,et al.  Computer lipreading for improved accuracy in automatic speech recognition , 1996, IEEE Trans. Speech Audio Process..

[27]  Horst Bunke,et al.  Combination of Classifiers on the Decision Level for Face Recognition , 1996 .

[28]  Gérard Chollet,et al.  Combining methods to improve speaker verification decision , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[29]  Pramod K. Varshney,et al.  Distributed Detection and Data Fusion , 1996 .

[30]  Jerry D. Cavin Advances in distributed sensor technology , 1996, IEEE Parallel & Distributed Technology: Systems & Applications.

[31]  Juergen Luettin,et al.  Integrating acoustic and labial information for speaker identification and verification , 1997, EUROSPEECH.

[32]  Juergen Luettin,et al.  Visual Speech and Speaker Recognition , 1997 .

[33]  John D. Woodward,et al.  Biometrics: privacy's foe or privacy's friend? , 1997, Proc. IEEE.

[34]  Sadaoki Furui,et al.  Recent advances in speaker recognition , 1997, Pattern Recognit. Lett..

[35]  Vlasta Radová,et al.  An approach to speaker identification using multiple classifiers , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[36]  Jiri Matas,et al.  Combining evidence in personal identity verification systems , 1997, Pattern Recognit. Lett..

[37]  Bernhard Fröba,et al.  SESAM: A Biometric Person Identification System Using Sensor Fusion , 1997, AVBPA.

[38]  Juergen Luettin,et al.  Acoustic-labial speaker verification , 1997, Pattern Recognit. Lett..

[39]  Jiri Matas,et al.  On Combining Classifiers , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[40]  Christopher J. C. Burges A Tutorial on Support Vector Machines for Pattern Recognition , 1998 .

[41]  Thorsten Joachims,et al.  Making large scale SVM learning practical , 1998 .

[42]  Anil K. Jain,et al.  Integrating Faces and Fingerprints for Personal Identification , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[43]  Gerasimos Potamianos,et al.  Discriminative training of HMM stream exponents for audio-visual speech recognition , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[44]  Volker Roth,et al.  Nonlinear Discriminant Analysis Using Kernel Functions , 1999, NIPS.

[45]  Marc Acheroy,et al.  A Contribution to Multi-Modal Identity Verification Using Decision Fusion , 1999 .

[46]  E. Mayoraz,et al.  Fusion of face and speech data for person identity verification , 1999, IEEE Trans. Neural Networks.

[47]  Sridha Sridharan,et al.  Robust speaker verification via fusion of speech and lip modalities , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[48]  Kuldip K. Paliwal,et al.  USE OF VOICING AND PITCH INFORMATION FOR SPEAKER RECOGNITION , 2000 .

[49]  Mübeccel Demirekler,et al.  An information theoretic framework for weight estimation in the combination of probabilistic classifiers for speaker identification , 2000, Speech Commun..

[50]  Sridha Sridharan,et al.  The use of temporal speech and lip information for multi-modal speaker identification via multi-stream HMMs , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[51]  Cheng-Shang Chang Calculus , 2020, Bicycle or Unicycle?.

[52]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[53]  Douglas A. Reynolds,et al.  The NIST speaker recognition evaluation - Overview, methodology, systems, results, perspective , 2000, Speech Commun..

[54]  Douglas A. Reynolds,et al.  Speaker Verification Using Adapted Gaussian Mixture Models , 2000, Digit. Signal Process..

[55]  Tim Wark,et al.  Multi-modal speech processing for automatic speaker recognition , 2001 .

[56]  Luís A. Alexandre,et al.  On combining classifiers using sum and product rules , 2001, Pattern Recognit. Lett..

[57]  Shigeo Abe Pattern Classification , 2001, Springer London.

[58]  Mübeccel Demirekler,et al.  Comparison of different objective functions for optimal linear combination of classifiers for speaker identification , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[59]  Wendy Atkins A testing time for face recognition technology , 2001 .

[60]  J. van Leeuwen,et al.  Audio- and Video-Based Biometric Person Authentication , 2001, Lecture Notes in Computer Science.

[61]  Chin-Chuan Han,et al.  Why recognition in a statistics-based face recognition system should be based on the pure face portion: a probabilistic decision-based proof , 2001, Pattern Recognit..

[62]  Ioannis Pitas,et al.  Recent advances in biometric person authentication , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[63]  Michael Grüninger,et al.  Introduction , 2002, CACM.

[64]  Kuldip K. Paliwal,et al.  Information Fusion and Person Verification Using Speech & Face Information , 2002 .

[65]  Nalini K. Ratha,et al.  Biometric perils and patches , 2002, Pattern Recognit..

[66]  Conrad Sanderson,et al.  The VidTIMIT Database , 2002 .

[67]  Samy Bengio,et al.  Multimodal Authentication Using Asynchronous HMMs , 2003, AVBPA.

[68]  Samy Bengio,et al.  Non-Linear Variance Reduction Techniques in Biometric Authentication , 2003 .

[69]  Anil K. Jain,et al.  Hiding Biometric Data , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[70]  Kuldip K. Paliwal,et al.  Fast features for face authentication under illumination direction changes , 2003, Pattern Recognit. Lett..

[71]  Kuldip K. Paliwal,et al.  Noise compensation in a person verification system using face and multiple speech features , 2003, Pattern Recognit..

[72]  Ara V. Nefian,et al.  A Bayesian Approach to Audio-Visual Speaker Identification , 2003, AVBPA.

[73]  Arun Ross,et al.  Information fusion in biometrics , 2003, Pattern Recognit. Lett..

[74]  Samy Bengio,et al.  Face verification using adapted generative models , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[75]  Samy Bengio,et al.  STATISTICAL TRANSFORMATION TECHNIQUES FOR FACE VERIFICATION USING FACES ROTATED IN DEPTH , 2004 .