A Survey on Big Data Market: Pricing, Trading and Protection

Big data is widely regarded as the key to unlocking the next great wave of productivity growth. The amount of data collected in our world has been exploding because of new applications and technologies that permeate our daily lives, including mobile and social networking applications and Internet of Things-based smart-world systems (smart grid, smart transportation, smart cities, and so on). As data grows exponentially, how to use it efficiently becomes a critical issue. This calls for the development of a big data market that enables efficient data trading. By treating data as a commodity in a digital market, data owners and consumers can connect with each other, sharing data and further increasing its utility. Nonetheless, enabling an effective market for data trading requires addressing several challenges: determining proper prices for the data to be sold or purchased; designing trading platforms and schemes that maximize the social welfare of trading participants while remaining efficient and privacy-preserving; and protecting traded data from being resold, so that the data retains its value. In this paper, we conduct a comprehensive survey on the lifecycle of data and data trading. Specifically, we first study a variety of data pricing models, categorize them into groups, and compare their pros and cons. Then, we focus on the design of data trading platforms and schemes that support efficient, secure, and privacy-preserving data trading. Finally, we review digital copyright protection mechanisms, including digital copyright identifiers, digital rights management, digital encryption, watermarking, and others, and outline the challenges of data protection in the data trading lifecycle.
