Acquisition of a Large Database for Biometric Identity Verification

In this paper we describe the acquisition of a large multi-modal database intended for training and testing multi-modal verification systems. When completed, the XM2FDB database will contain recordings of about 300 subjects taken over a period of four months.

The use of biometric measurements in security applications has become common enough that a dedicated journal [1] monitors developments in the area. Extremely reliable methods of biometric personal identification exist, e.g. fingerprint analysis and retinal or iris scans, but most of these methods are considered unacceptable by users in all but high-security scenarios. Personal identification systems based on the analysis of speech or of frontal or profile images of the face are non-intrusive and therefore user-friendly. Moreover, personal identity can often be ascertained without the client’s assistance. However, speech- and image-based systems are less robust to imposter attack, especially if the imposter possesses information about a client, e.g. a photograph or a recording of the client’s speech. Multi-modal personal verification is one of the most promising approaches to user-friendly (hence acceptable) yet highly secure personal verification systems [2].

Recognition and verification systems need training; the larger the training set, the better the performance achieved [3]. The volume of data required for training a multi-modal system based on the analysis of video and audio signals is on the order of TBytes (1 TByte = 1000 GBytes); technology allowing the manipulation and effective use of such amounts of data has only recently become available in the form of digital video.

To acquire the database, over three hundred volunteers from the University of Surrey visited a recording studio four times at approximately one-month intervals. On each visit (session) two recordings (shots) were made. The first shot consisted of speech. The subject, to whom a clip-on microphone had been attached, was asked to sit in a chair.
He/she was then asked to read three sentences which were written on a board positioned just below the camera. The subjects were asked to read at their normal pace, to pause briefly at the end of each sentence and to read through the three sentences twice. The three sentences remained the same throughout all four recording sessions and were