Self-identifying sensor data

Public-use sensor datasets are a useful scientific resource, but their provenance is easily disconnected from their content. To address this, we introduce a technique that associates provenance information directly with sensor datasets. The technique is similar to traditional watermarking but is intended for unstructured datasets. The mark is potentially imperceptible when the margins of error in a dataset are sufficiently large, and it is robust to a number of benign but likely transformations, including truncation, rounding, bit-flipping, sampling, and reordering. We provide algorithms for both one-bit and blind mark checking. The algorithms are probabilistic, and we characterize them with a combinatorial analysis.
