Semantic High Level Querying in Sensor Networks

The rapid development and deployment of sensor technology within the general frame of the Internet of Things poses significant opportunities and challenges. A sensor is not a pure data source but an entity (Semantic Sensor Web) with associated metadata, and it is a building block of a “worldwide distributed” real-time database to be processed through real-time queries. Important challenges are to achieve interoperability in connectivity and processing capabilities (queries) and to apply “intelligence” and processing capabilities as close as possible to the source of data. This paper presents an extension of a general architecture for data integration, to which we add capabilities for processing complex queries. We discuss how these capabilities can be adapted to, and used by, an application in the Semantic Sensor Web, and present a pilot study in the environment and health domains.

1 Background and Motivation

The rapid development and deployment of sensor technology involves many different types of sensors, both remote and in situ, with diverse capabilities in terms of range, modality, and manoeuvrability. It is possible today to use networks of multiple sensors to detect and identify objects of interest up close or from a great distance. Connected Objects (the Internet of Things) are expected to be a significant new market and to encompass a large variety of technologies and services in different domains. Transport, environmental management, health, agriculture, domestic appliances, building automation and energy efficiency will benefit from real-time reality mining and personal decision support capabilities, provided by the growing information shadow (i.e. data traces) of people, goods and objects and fed by the large volume of data available from the emerging Sensor Web [1].
Giordani I., Toscani D., Archetti F. and Cislaghi M. Semantic High Level Querying in Sensor Networks. DOI: 10.5220/0003116600720084. In Proceedings of the International Workshop on Semantic Sensor Web (SSW-2010), pages 72-84. ISBN: 978-989-8425-33-1. Copyright © 2010 SCITEPRESS (Science and Technology Publications, Lda.)

Possible exploitation areas of the Internet of Connected Objects include vertical applications that connect to and communicate with objects tailored for specific sub-domains; service enablement to cope with fragmented connectivity, device standards, application information protocols, etc.; device management; and extended connectivity tailored for object communication with regard to business model, service level, billing, etc. Important challenges are to achieve interoperability in connectivity and processing capabilities (queries, etc.) and to distribute “intelligence” and processing capabilities as close as possible to the source of data (the sensor or mobile device), in order to avoid massive data flows and bottlenecks on the connectivity side. The sensor is not a pure data source but an entity (Semantic Sensor Web) with associated domain metadata, capable of autonomous processing, and it is a building block of a “worldwide distributed” real-time database to be processed through real-time queries. The vision of the Semantic Sensor Web promises to unify the real and the virtual world by integrating sensor technologies and Semantic Web technologies. Sensors and their data will be formally described and annotated in order to facilitate the common integration, discovery and querying of information. Since this semantic information ultimately needs to be communicated by the sensors themselves, one may wonder whether existing techniques for processing, querying and modeling sensor data are still applicable under this increased load of transmitted data.
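The trade-off described above, between annotating sensor data with metadata and keeping the transmitted volume low, can be illustrated with a minimal Python sketch. It is not the system presented in this paper: the `Observation` fields, the sensor identifier, the readings and the threshold are all hypothetical, and JSON is used only as a convenient stand-in for whatever encoding a real deployment would adopt. The sketch compares the size of raw values, of fully annotated observations, and of a summary computed at the source.

```python
import json
from dataclasses import dataclass, asdict
from statistics import mean


@dataclass
class Observation:
    """One sensor reading with hypothetical domain metadata attached."""
    sensor_id: str
    observed_property: str  # e.g. "PM10" (illustrative)
    unit: str               # e.g. "ug/m3" (illustrative)
    timestamp: str          # ISO 8601 string
    value: float


def payload_size(obj) -> int:
    """Return the size in bytes of the JSON encoding of obj."""
    return len(json.dumps(obj).encode("utf-8"))


def summarize_at_source(observations, threshold):
    """Answer a simple query on the sensor node and return only a summary.

    The metadata is sent once for the whole batch instead of once per reading.
    """
    values = [o.value for o in observations]
    first = observations[0]
    return {
        "sensor_id": first.sensor_id,
        "observed_property": first.observed_property,
        "unit": first.unit,
        "count": len(values),
        "mean": mean(values),
        "exceedances": sum(1 for v in values if v > threshold),
    }


# Hypothetical hourly readings from a single node.
readings = [
    Observation("node-7", "PM10", "ug/m3", f"2010-01-01T{h:02d}:00:00Z", v)
    for h, v in enumerate([38.0, 42.5, 55.0, 61.0, 47.5, 30.0])
]

raw_bytes = payload_size([o.value for o in readings])
annotated_bytes = payload_size([asdict(o) for o in readings])
summary = summarize_at_source(readings, threshold=50.0)  # threshold is illustrative
summary_bytes = payload_size(summary)

print(raw_bytes, annotated_bytes, summary_bytes)
```

In this toy setting the annotated payload is much larger than the raw values, while the summary computed on the node carries the metadata only once, which is the kind of saving that motivates moving processing toward the source of data.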
In the remainder of this section we introduce the state of the art in data querying over networks of data providers. In Sect. 2 we present the software architecture of a data integration system to which we added complex query processing features. Sect. 3 introduces the case study in which we deployed our system: the study of the short-term effects of air pollution on health. Sect. 4 presents the detailed implementation of the querying features together with results on real data sets. Finally, Sect. 5 presents the conclusions and future work.

1.1 State of the Art

This paper stems from the work presented in [12], which describes a software system aimed at forecasting the demand for patient admissions to health care structures due to environmental pollution. The target users of this decision support tool are health care managers and public administrators, who need help in resource allocation and policy implementation. The key feature of that system was its algorithmic kernel, which performs time series analysis through Autoregressive Hidden Markov Models (AHMM) [7]. The scenario in which the system has been deployed is the research project LENVIS¹, which aims to create a network of services for data and information sharing based on heterogeneous and distributed data sources and modeling. One of the innovations brought by LENVIS is “service oriented business intelligence”, i.e. an approach to Business Intelligence in which the information presented to the user comes from data processing that is performed online (data are extracted at the request of the applications) and on the basis of data availability (data are exchanged through web services, which guarantee neither response time nor availability). Such a complex environment, in which data sources are distributed over the internet, is common to several problems and has been addressed through different approaches.
¹ LENVIS: Localised environmental and health information services for all. FP7-ICT-2007-2. Project number 223925. www.lenvis.eu

One of them is that of [13], in which “monitoring queries” continuously collect data about spatially related physical phenomena. An algorithm, called Adaptive Pocket Driven Trajectories, is used to select data collection paths based on the spatial layout of sen