Integrating Evidences of Independently Developed Face and Speaker Recognition Systems by Using Discrete Probability Density Function

User recognition is one of the most fundamental functionalities for intelligent service robots. However, in robot applications the conditions are far more severe than in traditional biometric security systems. Robots must recognize users non-intrusively, which confines the available biometric features to face and voice. They are also expected to recognize users from a relative distance, which inevitably degrades the accuracy of each recognition module. In this paper, we improve the overall accuracy by integrating the evidences issued by independently developed face and speaker recognition modules. Each recognition module exhibits different statistical characteristics in representing its confidence in the recognition; it is therefore essential to transform the evidences into a normalized form before integrating the results. This paper introduces a novel approach for integrating mutually independent multiple evidences to achieve improved performance. A typical approach to this problem is to model the statistical characteristics of the evidences with a well-known parametric form such as a Gaussian; using the Mahalanobis distance is a good example. However, the characteristics of the evidences often do not fit the parametric models, which results in performance degradation. To overcome this problem, we adopt a discrete PDF that can model the statistical characteristics as they are. To confirm the validity of the proposed method, we used a multi-modal database consisting of 10 registered users and 550 probe data, where each probe datum contains a face image and a voice signal. The face and speaker recognition modules are applied to generate the respective evidences. The experiment showed an improvement of 11.27% in accuracy compared to the individual recognizers, which is 2.72% better than the traditional Mahalanobis distance approach.
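The abstract's core idea can be sketched in code: instead of assuming each recognizer's confidence scores are Gaussian (as a Mahalanobis-distance normalization would), estimate a discrete PDF, i.e. a normalized histogram, from training scores and use it directly to normalize and fuse the independent face and voice evidences. The sketch below is a minimal illustration under assumed conventions; the function names (`fit_discrete_pdf`, `likelihood`, `fuse`), the likelihood-ratio fusion rule, and the score range are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def fit_discrete_pdf(train_scores, bins=20, rng=(0.0, 1.0)):
    # Discrete PDF estimate of a recognizer's confidence scores:
    # a normalized histogram, with no parametric (e.g. Gaussian) assumption.
    pdf, edges = np.histogram(train_scores, bins=bins, range=rng, density=True)
    return pdf, edges

def likelihood(pdf, edges, score):
    # Look up the discrete PDF value for a new confidence score
    # by locating the histogram bin that contains it.
    idx = int(np.clip(np.searchsorted(edges, score, side="right") - 1,
                      0, len(pdf) - 1))
    return pdf[idx]

def fuse(face_score, voice_score, face_model, voice_model):
    # Each model holds discrete PDFs of genuine and impostor scores.
    # Independent evidences are combined by multiplying the per-modality
    # genuine/impostor likelihood ratios (an assumed fusion rule).
    def ratio(score, model):
        g = likelihood(*model["genuine"], score)
        i = likelihood(*model["impostor"], score)
        return (g + 1e-9) / (i + 1e-9)  # smoothing avoids division by zero
    return ratio(face_score, face_model) * ratio(voice_score, voice_model)
```

A higher fused value indicates stronger combined support for the genuine-user hypothesis; because the histograms are learned per module, each recognizer's idiosyncratic score distribution is normalized away before the evidences are merged.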
