Face-Voice Authentication Based on 3D Face Models

In this paper we propose fusing the shape and texture information from 3D face models with acoustic features extracted from spoken utterances to improve performance against impostor and replay attacks. Experiments conducted on two multimodal speaking-face corpora, VidTIMIT and AVOZES, achieved equal error rates (EERs) below 2% for impostor attacks and below 1% for type-1 replay attacks when acoustic, shape and texture features were fused. For type-2 replay attacks, a more challenging type of spoofing attack, an EER below 7% was achieved.
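The feature-level fusion described above can be illustrated with a minimal sketch. The normalization scheme, cosine-similarity scorer, and threshold below are illustrative assumptions, not the paper's actual pipeline; the function names (`fuse`, `verify`) are hypothetical.

```python
import numpy as np

def zscore(x, mean, std):
    # Normalize each modality separately so no single feature
    # set dominates the fused vector (assumed normalization).
    return (x - mean) / std

def fuse(acoustic, shape, texture):
    # Feature-level fusion: concatenate the per-modality
    # feature vectors into one multimodal vector.
    return np.concatenate([acoustic, shape, texture])

def verify(probe, enrolled, threshold=0.5):
    # Score the fused probe against the enrolled template with
    # cosine similarity; accept if the score clears the threshold.
    score = float(np.dot(probe, enrolled) /
                  (np.linalg.norm(probe) * np.linalg.norm(enrolled)))
    return score, score >= threshold
```

An impostor presenting only a recorded voice (a replay attack) would lack matching 3D shape and texture features, which lowers the fused similarity score relative to a genuine claim.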