Multi-objective nested algorithms for optimal reservoir operation

The optimal reservoir operation is in general a multi-objective problem, meaning that multiple objectives are to be considered at the same time. For solving multi-objective optimization problems there exist a large number of optimization algorithms – which result in a generation of a Pareto set of optimal solutions (typically containing a large number of them), or more precisely, its approximation. At the same time, due to the complexity and computational costs of solving full-fledge multi-objective optimization problems some authors use a simplified approach which is generically called " scalarization ". Scalarization transforms the multi-objective optimization problem to a single-objective optimization problem (or several of them), for example by (a) single objective aggregated weighted functions, or (b) formulating some objectives as constraints. We are using the approach (a). A user can decide how many multi-objective single search solutions will generate, depending on the practical problem at hand and by choosing a particular number of the weight vectors that are used to weigh the objectives. It is not guaranteed that these solutions are Pareto optimal, but they can be treated as a reasonably good and practically useful approximation of a Pareto set, albeit small. It has to be mentioned that the weighted-sum approach has its known shortcomings because the linear scalar weights will fail to find Pareto-optimal policies that lie in the concave region of the Pareto front. In this context the considered approach is implemented as follows: there are m sets of weights {w1i,. .. wni} (i starts from 1 to m), and n objectives applied to single objective aggregated weighted sum functions of nested dynamic programming (nDP), nested stochastic dynamic programming (nSDP) and nested reinforcement learning (nRL). By employing the multi-objective optimization by a sequence of single-objective optimization searches approach, these algorithms acquire the multi-objective properties. These algorithms have been denoted as multi-objective nDP (MOnDP), multi-objective nSDP (MOnSDP) and multi-objective nRL (MOnRL). The MOnDP, MOnSDP and MOnRL algorithms were tested at the multipurpose reservoir Knezevo of the Zletovica hydro-system located in the Republic of Macedonia, with eight objectives, including urban water supply, agriculture, ensuring ecological flow, and generation of hydropower. The MOnSDP and MOnRL derived/learned the optimal reservoir policy using 45 (1951-1995) years and were tested on 10 (1995-2005) years historical data, and compared with MOnDP optimal reservoir operation in the same period. The found solutions form the Pareto optimal set in eight dimensional objective function space (since the eight …