With the rapid development of smart hardware and networking protocols, more and more IoT sensors are becoming publicly accessible through the Internet. Many semantically enhanced IoT sensors store the events they capture in their description files, which makes it possible to build a generic IoT search engine. Crawling the events captured by these sensors is a fundamental step towards building such a search engine. However, this step is challenging because sensors sleep periodically and have a limited energy supply. Applying traditional web access strategies to IoT applications may cause unpredictable latency in receiving events as well as low power efficiency. In this paper, we first formulate the problem of crawling newly captured events from periodically sleeping sensors as a scheduling problem that can be solved by constrained optimization. We take expected latency as the optimization objective, since it indicates whether crawlers can gather the wanted events in time. We then propose a sleep-aware scheduling method, named EasiCrawl, that achieves near-optimal expected latency in receiving events. Finally, we evaluate EasiCrawl through simulations and a case study with real-world data from Xively. The simulation results show that EasiCrawl achieves lower latency than the periodic and greedy crawl strategies.
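To make the notion of expected latency under a sleep schedule concrete, the following is a minimal sketch and not EasiCrawl itself. It assumes a hypothetical duty cycle in which the sensor is awake during the first `awake` time units of every `period`, that a crawl issued while the sensor sleeps is simply deferred to the next awake window, and that events are captured uniformly over time. The function names and parameters are illustrative only.

```python
from typing import List


def next_awake_time(t: float, period: float, awake: float) -> float:
    """Earliest time >= t at which the sensor is awake.

    Assumed duty cycle: awake during [k*period, k*period + awake)
    for every integer k >= 0.
    """
    phase = t % period
    if phase < awake:
        return t  # already inside an awake window
    return t - phase + period  # defer to the start of the next window


def expected_latency(crawl_times: List[float]) -> float:
    """Mean wait until the next crawl, for events uniform on (first, last].

    An event captured in a gap of length g waits g/2 on average, and a
    fraction g/total of all events falls into that gap, so the mean is
    sum(g^2) / (2 * total).
    """
    total = crawl_times[-1] - crawl_times[0]
    gaps = [b - a for a, b in zip(crawl_times, crawl_times[1:])]
    return sum(g * g for g in gaps) / (2.0 * total)


# A crawler that requests every 7 time units, unaware of the sleep schedule.
requested = [0.0, 7.0, 14.0, 21.0, 30.0]
# The crawl times that actually succeed once sleep is taken into account.
effective = [next_awake_time(t, period=10.0, awake=2.0) for t in requested]
latency = expected_latency(effective)
```

Under these assumptions the request at time 7 only succeeds at time 10 and the request at time 14 only succeeds at time 20, so the schedule the crawler believes it is running differs from the one that determines latency. A sleep-aware scheduler would choose crawl times with the awake windows in mind instead of relying on deferral.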