Paper information - Necessary and sufficient Karush-Kuhn-Tucker conditions for multiobjective Markov chains optimality

The solution concepts proposed in this paper follow the Karush-Kuhn-Tucker (KKT) conditions for a Pareto optimal solution in multi-objective programming problems over finite-time, ergodic, and controllable Markov chains. To solve the problem, we introduce a Tikhonov regularizer that ensures the objective function is strictly convex. We then use the c-variable method to introduce equality constraints that guarantee the result belongs to the simplex and satisfies the ergodicity constraints. Lastly, we restrict the cost functions so that points on the Pareto front lie a small distance from one another. The computed image points give a continuous approximation of the whole Pareto surface. The constraints imposed by the c-variable method make the problem computationally tractable, and the restriction imposed by the small distance change ensures the continuation of the Pareto front. We transform the multi-objective nonlinear problem into an equivalent nonlinear programming problem by introducing Lagrange multipliers. As a result, the objective function is strictly convex, the inequality constraints are continuously differentiable, and the equality constraint is an affine function. Under these settings, the necessary and sufficient KKT optimality conditions follow naturally. A numerical example illustrates the basic techniques for computing Pareto optimal solutions by means of the KKT conditions.
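To make the role of the regularizer and the KKT conditions concrete, the following is a minimal sketch on a toy problem, not the paper's formulation. It assumes a single scalarized linear cost `w` (for instance a weighted sum of cost vectors), a hypothetical regularization weight `delta`, and only the simplex constraints on the variable `c`; the ergodicity constraints and the distance restriction between Pareto points are omitted. Because the term `(delta / 2) * ||c||^2` makes the objective strictly convex, the minimizer is unique and equals the Euclidean projection of `-w / delta` onto the simplex, and the KKT conditions can be checked directly.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {c : c >= 0, sum(c) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold of the last index that stays positive
    return [max(x - theta, 0.0) for x in v]


def solve_regularized(w, delta):
    """Minimize w.c + (delta/2)*||c||^2 over the simplex (toy problem, delta > 0)."""
    return project_to_simplex([-w_i / delta for w_i in w])


def kkt_multipliers(w, delta, c, tol=1e-12):
    """Recover multipliers from stationarity: w_i + delta*c_i - mu_i + lam = 0.

    lam belongs to the equality constraint sum(c) = 1,
    mu_i >= 0 belongs to the inequality constraint c_i >= 0.
    """
    grad = [w_i + delta * c_i for w_i, c_i in zip(w, c)]
    support = [g for g, c_i in zip(grad, c) if c_i > tol]
    lam = -sum(support) / len(support)  # mu_i = 0 wherever c_i > 0
    mu = [g + lam for g in grad]
    return lam, mu


def satisfies_kkt(w, delta, c, tol=1e-9):
    """Check primal feasibility, dual feasibility, and complementary slackness."""
    lam, mu = kkt_multipliers(w, delta, c)
    primal = abs(sum(c) - 1.0) <= tol and all(c_i >= -tol for c_i in c)
    dual = all(m >= -tol for m in mu)
    slack = all(abs(m * c_i) <= tol for m, c_i in zip(mu, c))
    return primal and dual and slack


# Illustrative data only; these numbers do not come from the paper.
w = [1.0, 2.0, 4.0]
delta = 2.0
c_star = solve_regularized(w, delta)
print(c_star, satisfies_kkt(w, delta, c_star))
```

Since the objective is strictly convex and the constraints are affine in this toy setting, any point passing `satisfies_kkt` is the unique global minimizer, which mirrors the necessary-and-sufficient claim of the abstract on a much simpler problem.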

Julio B. Clempner
