In an Internet of Things (IoT) environment, a user may need to receive a set of services from the environment to accomplish a given task. The IoT devices required to provide these services must then be located and accessed in a way that delivers the functional effects of the services to the user effectively. Ensuring this successful delivery of service effects becomes even more challenging when users and/or IoT devices are mobile. In this paper, we extend our previous work on spatially-cohesive service selection by adopting a reinforcement learning technique to cope with highly dynamic situations in mobile IoT environments. We propose a reinforcement learning agent that dynamically selects services by associating IoT devices with them, while optimizing metrics such as spatio-cohesiveness and the number of handovers over the period in which the services are provided. We evaluate our approach by simulating the agent, and the results show that the agent successfully learns an optimal service selection policy.
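As a rough illustration of how such an agent's objective might be shaped (this sketch is not taken from the paper), a per-step reward could combine a spatio-cohesiveness term with a penalty for each device handover. All function names, the distance-based cohesiveness measure, and the weights below are assumptions made for illustration only.

```python
import math

# Hypothetical sketch: a per-step reward trading off spatio-cohesiveness
# against handover cost for mobile IoT service selection.
# The cohesiveness formula and the weights are illustrative assumptions.

def spatio_cohesiveness(user_pos, device_positions):
    """Illustrative measure: inverse of the mean user-to-device distance."""
    if not device_positions:
        return 0.0
    mean_dist = sum(math.dist(user_pos, p) for p in device_positions) / len(device_positions)
    return 1.0 / (1.0 + mean_dist)

def step_reward(user_pos, selection, prev_selection, positions,
                w_cohesion=1.0, w_handover=0.5):
    """Reward = weighted cohesiveness minus a penalty per device handover.

    selection / prev_selection: dict mapping service name -> chosen device id
    positions: dict mapping device id -> (x, y) coordinates
    """
    cohesion = spatio_cohesiveness(user_pos, [positions[d] for d in selection.values()])
    handovers = sum(1 for s, d in selection.items()
                    if s in prev_selection and prev_selection[s] != d)
    return w_cohesion * cohesion - w_handover * handovers

# Example: the agent keeps "display" on dev1 but hands "audio" over from dev2 to dev3.
positions = {"dev1": (0.0, 0.0), "dev2": (5.0, 0.0), "dev3": (1.0, 1.0)}
prev = {"display": "dev1", "audio": "dev2"}
curr = {"display": "dev1", "audio": "dev3"}
print(step_reward(user_pos=(0.5, 0.5), selection=curr, prev_selection=prev, positions=positions))
```

Under these assumptions, an actor-critic or policy-gradient learner would then be trained to maximize the discounted sum of such rewards over the service provision period.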