Towards a Quality-centric Big Data Architecture for Federated Sensor Services

As the Internet of Things (IoT) paradigm gains popularity, the next few years will likely witness the 'servitization' of domain sensing functionalities. We envision a cloud-based ecosystem in which high-quality data from large numbers of independently managed sensors is shared, or even traded, in real time. Such an ecosystem will necessarily have multiple stakeholders, such as sensor data providers, domain applications that utilize sensor data (data consumers), and cloud infrastructure providers, who may collaborate as well as compete. While there has been considerable research on wireless sensor networks, the challenges involved in building cloud-based platforms for hosting sensor services are largely unexplored. In this paper, we present our vision for a data quality (DQ)-centric big data infrastructure for federated sensor service clouds. We first motivate our work with real-world examples and outline the key features that federated sensor service clouds need to possess. We then propose a big data architecture in which DQ is pervasive throughout the platform. Our architecture includes a markup language called SDQ-ML, used both to describe sensor services and to let domain applications express their sensor feed requirements. We explore the advantages and limitations of current big data technologies in building various components of the platform, and outline our initial ideas for addressing these limitations.
