Authenticated Online Data Integration Services

Data integration combines data from multiple sources and gives users a unified query interface. Data integrity has long been a key problem in online data integration. Although a variety of techniques have been proposed to address data consistency and reliability, little work has addressed the integrity of the integrated data or the correctness of query results. In this paper, we take a first step by proposing authenticated data integration services that ensure data and query integrity even in the presence of an untrusted integration server. We develop a novel authentication code, called the homomorphic secret sharing seal, that allows the untrusted server to faithfully aggregate the inputs from individual sources for future query authentication. Based on this seal, we design two authenticated index structures and authentication schemes for queries on multi-dimensional data. We further study the freshness problem in multi-source query authentication and propose several advanced update strategies. Analytical models and empirical results show that our seal design and authentication schemes are efficient and robust under various system settings.
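
To make the idea of a seal that an untrusted server can aggregate more concrete, the following is a minimal toy sketch. It is not the paper's seal construction, which the abstract does not specify. It assumes a simple additively homomorphic code in which each source seals its value as `key * value + share` modulo a prime, where the shares sum to a secret known only to the client. The names `P`, `make_shares`, `seal`, `aggregate`, and `verify` are hypothetical and chosen for illustration only.

```python
import secrets

# Assumption: arithmetic is done modulo the Mersenne prime 2^127 - 1.
P = 2**127 - 1


def make_shares(total_secret, n):
    """Split total_secret into n additive shares modulo P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((total_secret - sum(shares)) % P)
    return shares


def seal(value, key, share):
    """Source side: seal one value as key * value + share (mod P)."""
    return (key * value + share) % P


def aggregate(seals):
    """Untrusted server: add seals without knowing the key or any share."""
    return sum(seals) % P


def verify(claimed_sum, aggregated_seal, key, total_secret):
    """Client side: check the aggregated seal against the claimed sum."""
    return aggregated_seal == (key * claimed_sum + total_secret) % P


# Toy run with three sources (values are made up for illustration).
key = secrets.randbelow(P - 1) + 1      # nonzero key shared by sources and client
total_secret = secrets.randbelow(P)     # known to the client only
values = [12, 30, 7]
shares = make_shares(total_secret, len(values))
seals = [seal(v, key, s) for v, s in zip(values, shares)]
aggregated = aggregate(seals)

honest_ok = verify(sum(values), aggregated, key, total_secret)
tampered_ok = verify(sum(values) + 1, aggregated, key, total_secret)
```

In this toy setting the server can combine the seals by plain addition, yet a claimed sum that differs from the true one fails verification, because adjusting the aggregated seal would require the key. A real scheme would also need to cover multi-dimensional queries, index structures, and freshness, which this sketch leaves out.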
