We present an identification system based on the naturally occurring inhomogeneities of the surface of paper. We investigate the scaling of its performance for verification and identification through a general and rigorous framework and present a random coding argument that links biometric identification to communication through a noisy channel. We measure the effective communication rate and information density for various configurations of the system.

INTRODUCTION

The possibility of authenticating products or documents by matching difficult-to-duplicate, inhomogeneous or “random” structure to an associated description of the genuine article has been investigated in various contexts [1], [2]. Escher Labs’ FiberFingerprint technology uses the naturally occurring irregularities of a substrate as a means to discriminate between various documents or objects.

Figure 1. FiberFingerprint verifier for mail piece

The practical motivation for this technology is security. Compared to a barcode, digital watermark, or other embedded serial number, the identity of a FiberFingerprinted object is difficult to forge, given the length scale and three-dimensional aspects of the physical properties being sampled.

This paper considers the use of such random inhomogeneities for item identification, rather than simple authentication. A database of samples of the random characteristic is maintained. When an item to be identified is submitted, its random characteristic is measured and compared to the database to find the best match.

We investigate the scaling of the performance of such an identification system. In [3], one of us performed a related study using randomly assigned, noiseless IDs. In the present context, we are again using randomly assigned identifiers, but have dropped the assumption of noiselessness: each time a particular object is sensed, somewhat different data is collected because of noise.
Verification versus identification

From a general point of view, the system’s main function is to compare fingerprints and to assess the degree to which they match.

In the context of the verification problem, the system will typically be provided with an observation of an object and a claimed identity for that object. The task of the system is to compare two fingerprints. The first is recalled from a database, based on the claimed identity, and the second is extracted from the observation of the presented object. The system is expected to assess whether a particular match is “good enough”.

In the context of the identification problem, the task of the system is to infer the identity of the object based on its fingerprint. Without any particular claim concerning the object’s identity, it is natural to assume that the object’s fingerprint has to be matched against a possibly large number of fingerprints stored in a database in order to find “the best match”.

Hence, while the two systems may use the same numerical criterion to evaluate the match between two fingerprints (e.g. error rate, distance, correlation), their ultimate decision rules will differ. Salient performance characteristics should reflect these differences.

Motivations and overview

In a careful attempt to quantify the performance of our FiberFingerprint approach, we have investigated means to measure salient performance characteristics. Along with the actual performance data, these means are the main object of this document. Because of our approach’s obvious similarities with biometrics and auto-identification, we believe that this performance analysis may be relevant outside the specific context of our FiberFingerprints.

In what follows, we will first describe the fundamentals of the FiberFingerprint technology. We will present salient performance characteristics in the context of verification and identification.
We will describe our testing environment and propose a means to extrapolate the performance statistics we seek from our observations. Lastly, we will present an analogy between identification systems and communication systems that provides a simple framework for understanding our performance results.

FIBERFINGERPRINT TECHNOLOGY

System overview

The system uses registration marks to identify the area of the medium that should be analyzed. These registration marks typically consist of a few small dots spanning a total surface area of less than 25 mm. The imager consists of a consumer-grade video module and lens, housed along with the appropriate lighting apparatus. The imager provides a grayscale capture of the medium’s texture to the software responsible for the FiberFingerprint analysis.

Figure 2. Imager and objects (tokens) used for this study

The FiberFingerprints reported on here are derived from a one-dimensional signal, the Fiber Signal, which is extracted from the two-dimensional capture along a series of linear segments. These linear segments constitute the Signal Path, and they are an important part of the system’s configuration. The detection of the registration mark provides all the translation, rotation and scaling information needed to overlay the Signal Path on the captured image of the substrate texture. Running along the Signal Path, a raw signal is extracted by averaging pixel values along the transverse direction of the Signal Path. The spatial sampling frequency used at this stage is also a part of the system’s configuration.

Figure 3. Sample capture from the imager; overlaid lines connecting the registration dots (blue) and Signal Path (red)

The resulting one-dimensional raw signal is subsequently high-passed. From the perspective of the original image, this is a high-pass filtering stage in the longitudinal direction of the Signal Path.
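The extraction and high-pass stages described above can be illustrated with a minimal sketch. It is not the actual FiberFingerprint implementation: the function names, the nearest-neighbour pixel lookup, the parameters `n_samples`, `half_width` and `window`, and the choice of a moving-average subtraction as the high-pass filter are all assumptions made for illustration, and the sketch handles a single linear segment of the Signal Path.

```python
import math


def extract_raw_signal(image, start, end, n_samples, half_width):
    """Sample one linear segment, averaging pixels across its width.

    image is a list of rows of grayscale values; start and end are (x, y)
    endpoints in pixel coordinates. Nearest-neighbour lookup is an
    assumption of this sketch, not a detail taken from the paper.
    """
    if n_samples < 2:
        raise ValueError("need at least two samples along the segment")
    (x0, y0), (x1, y1) = start, end
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("segment endpoints must differ")
    ux, uy = (x1 - x0) / length, (y1 - y0) / length  # longitudinal direction
    nx, ny = -uy, ux                                 # transverse direction
    height, width = len(image), len(image[0])
    signal = []
    for i in range(n_samples):
        t = length * i / (n_samples - 1)
        cx, cy = x0 + t * ux, y0 + t * uy
        total, count = 0.0, 0
        # Average the pixels lying across the path at this position.
        for k in range(-half_width, half_width + 1):
            px, py = round(cx + k * nx), round(cy + k * ny)
            if 0 <= px < width and 0 <= py < height:
                total += image[py][px]
                count += 1
        signal.append(total / count if count else 0.0)
    return signal


def high_pass(signal, window):
    """Remove slow variations by subtracting a centred moving average."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(signal[i] - sum(signal[lo:hi]) / (hi - lo))
    return out
```

In this sketch `n_samples` plays the role of the spatial sampling frequency, and a full Signal Path would be handled by concatenating the raw signals of its segments before filtering.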
The resulting filtered signal is finally re-sampled and normalized to yield the Fiber Signal. The FiberFingerprint is a quantized and optionally formatted version of the Fiber Signal.

Matching criterion

As a means to score the match between two Fiber Signals, and in the absence of any prior information concerning the probability distributions of the data, computing correlation coefficients naturally comes to mind. As we further wanted to express this criterion in terms of an error rate, we chose the following measure.
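As a purely illustrative sketch, and not the measure chosen by the authors, the code below shows one way a correlation coefficient and an error-rate style score could be computed, and how the verification and identification decision rules differ when built on the same score. The one-bit quantization, the bit error rate, the threshold and all function names are assumptions introduced here.

```python
import math


def normalize(signal):
    """Shift a signal to zero mean and scale it to unit variance."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in signal) / n)
    if std == 0:
        raise ValueError("a constant signal cannot be normalized")
    return [(s - mean) / std for s in signal]


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length signals."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return sum(x * y for x, y in zip(normalize(a), normalize(b))) / len(a)


def quantize(signal):
    """Assumed one-bit quantization: 1 where the sample exceeds the mean."""
    return [1 if s > 0 else 0 for s in normalize(signal)]


def bit_error_rate(fp_a, fp_b):
    """Fraction of positions at which two fingerprints disagree."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    return sum(x != y for x, y in zip(fp_a, fp_b)) / len(fp_a)


def verify(observed, enrolled, threshold):
    """Verification: is the match with the claimed identity good enough?"""
    return bit_error_rate(observed, enrolled) <= threshold


def identify(observed, database):
    """Identification: return the identity with the best-matching fingerprint."""
    return min(database, key=lambda name: bit_error_rate(observed, database[name]))
```

Here `verify` compares against a single enrolled fingerprint and a hypothetical acceptance threshold, whereas `identify` searches a dictionary of enrolled fingerprints for the lowest error rate, which mirrors the difference in decision rules between the two problems.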
[1] Joshua R. Smith. Distributing Identity. 1999.
[2] Anil K. Jain, et al. FVC2000: Fingerprint Verification Competition. IEEE Trans. Pattern Anal. Mach. Intell., 2002.
[3] John G. Proakis, et al. Probability, random variables and stochastic processes. IEEE Trans. Acoust. Speech Signal Process., 1985.
[4] Andrew V. Sutherland, et al. Microstructure Based Indicia. 2003.
[5] James L. Wayman, et al. Error rate equations for the general biometric system. IEEE Robotics Autom. Mag., 1999.