Digital audio forensics: a first practical evaluation on microphone and environment classification

This paper presents a first approach to digital media forensics that determines the microphone used and the recording environment of digital audio samples by using known audio steganalysis features. Our first evaluation is based on a limited, exemplary test set of 10 different audio reference signals recorded as mono audio data by four microphones in 10 different rooms at a 44.1 kHz sampling rate with 16-bit quantisation. Because of this limited scope, the results cannot be generalised. Motivated by the syntactical and semantic analysis of information, and in particular by known audio steganalysis approaches, a first set of specific features is selected to evaluate whether it can support correct classification. The idea was mainly driven by the existing steganalysis features and the question of their applicability to a first, limited test set. The tests presented in this paper perform an inter-device analysis with different device characteristics; intra-device evaluations (identical microphone models from the same manufacturer) are not considered. For classification, the data mining tool WEKA is applied, with K-means as a clustering technique and Naive Bayes as a classification technique, with the goal of evaluating their classification accuracy on known audio steganalysis features. Our results show that, for our test set, the classification techniques used, and the selected steganalysis features, microphones can be classified better than environments. These first tests show promising results, but they are based on a limited test and training set as well as a specific test set generation procedure. Additional and enhanced features, together with different test set generation strategies, are therefore necessary to generalise the findings.
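The two techniques named above can be illustrated with a minimal sketch. It is not the WEKA setup used in the evaluation: it is a plain Python re-implementation of a Gaussian Naive Bayes classifier and a basic K-means loop, run on synthetic two-dimensional feature vectors. The microphone labels (`mic_A` to `mic_D`), the cluster centres, and all function names are hypothetical and stand in for real steganalysis features, which are not reproduced here.

```python
import math
import random
from collections import defaultdict

# Synthetic stand-in data (assumption): four hypothetical microphones,
# each producing 2-D feature vectors scattered around its own centre.
rng = random.Random(42)
centers = {"mic_A": (0.0, 0.0), "mic_B": (5.0, 5.0),
           "mic_C": (0.0, 5.0), "mic_D": (5.0, 0.0)}
X, y = [], []
for label, (cx, cy) in centers.items():
    for _ in range(30):
        X.append([rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)])
        y.append(label)

def fit_gaussian_nb(rows, labels):
    """Estimate class priors and per-feature means and variances."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(rows), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def kmeans(rows, k, iterations=20, seed=0):
    """Basic K-means (Lloyd's algorithm); returns a cluster index per row."""
    centroids = [list(c) for c in random.Random(seed).sample(rows, k)]

    def nearest(row):
        return min(range(k),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for row in rows:
            clusters[nearest(row)].append(row)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = [[sum(col) / len(members) for col in zip(*members)]
                     if members else centroids[i]
                     for i, members in enumerate(clusters)]
    return [nearest(row) for row in rows]

# Supervised case: train on even-indexed samples, test on odd-indexed ones.
train_X, train_y = X[0::2], y[0::2]
test_X, test_y = X[1::2], y[1::2]
model = fit_gaussian_nb(train_X, train_y)
predictions = [predict_gaussian_nb(model, row) for row in test_X]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)

# Unsupervised case: cluster all samples without using the labels.
assignments = kmeans(X, k=4)
print(f"Naive Bayes accuracy: {accuracy:.2f}")
print(f"K-means clusters used: {len(set(assignments))}")
```

In this sketch the classifier is scored by accuracy on held-out samples, while the clustering result would have to be compared against the known microphone labels afterwards, since K-means itself never sees them.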
