Optimizing activity-travel sequences by means of reinforcement learning

For the last decade, activity-based models have set the standard for modelling travel demand. The idea behind these models is that travel demand is derived from the activities that individuals and households need or wish to perform. Given a sequence of activities and transport modes, for which a framework has been provided in previous work, this paper evaluates the use of a reinforcement learning technique. The technique simulates time and location allocation for these predicted sequences and enables the prediction of a more complete and consistent activity pattern. The main contributions of the paper to the current state of the art are the allocation of location information in the simulation of activity-travel patterns, the application to realistic empirical data, the non-restriction to a given number of activities, and the incorporation of realistic travel times. Furthermore, the time and location allocation problems are treated and integrated simultaneously, which means that the respondent's reward is maximized not only in terms of minimum travel duration but also in terms of optimal time allocation. A computer code has been developed to automate the process and has been validated on empirical data. For the covering abstract of the conference, see ITRD Abstract no. E218203.
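To illustrate the general idea of jointly optimizing time and location allocation with reinforcement learning, the sketch below applies tabular Q-learning to a fixed, predicted activity sequence. It is a minimal toy example, not the authors' implementation: the activity sequence, candidate locations, travel-time matrix, duration grid, and utility parameters are all assumed for demonstration. The reward combines a time-allocation term (penalizing deviation from an assumed ideal duration) with a travel-duration penalty, reflecting the simultaneous treatment described in the abstract.

```python
"""
Minimal Q-learning sketch for joint time and location allocation along a fixed
activity sequence. Illustrative only: ACTIVITIES, LOCATIONS, DURATIONS,
TRAVEL_TIME and IDEAL are assumed values, not data from the paper.
"""
import random
from collections import defaultdict

ACTIVITIES = ["work", "shop", "leisure"]          # fixed, predicted sequence (assumed)
LOCATIONS = [0, 1, 2]                             # candidate locations (assumed)
DURATIONS = [30, 60, 120]                         # candidate durations in minutes (assumed)
TRAVEL_TIME = [[0, 15, 30],                       # travel-time matrix in minutes (assumed)
               [15, 0, 20],
               [30, 20, 0]]
IDEAL = {"work": 120, "shop": 30, "leisure": 60}  # assumed "optimal" durations

def reward(activity, duration, travel):
    """Reward trades off time-allocation utility against travel duration."""
    return -abs(duration - IDEAL[activity]) - travel

ACTIONS = [(loc, dur) for loc in LOCATIONS for dur in DURATIONS]
Q = defaultdict(float)                            # Q[(state, action)]
alpha, gamma, eps = 0.1, 0.95, 0.2                # learning rate, discount, exploration

def run_episode(learn=True):
    loc, total = 0, 0.0                           # start at "home" location 0
    for i, act in enumerate(ACTIVITIES):
        state = (i, loc)
        if learn and random.random() < eps:
            action = random.choice(ACTIONS)       # explore
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])  # exploit
        new_loc, dur = action
        r = reward(act, dur, TRAVEL_TIME[loc][new_loc])
        next_state = (i + 1, new_loc)
        if learn:
            best_next = max(Q[(next_state, a)] for a in ACTIONS)
            Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])
        loc, total = new_loc, total + r
    return total

for _ in range(5000):                             # train on repeated episodes
    run_episode()
print("greedy-policy return:", run_episode(learn=False))
```

In this sketch the state is the index of the next activity plus the current location, and an action assigns both a location and a duration to that activity, so travel-time minimization and time allocation are optimized within a single reward, in the spirit of the integrated approach the abstract describes.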