Preference Elicitation and Explanation in Iterative Planning

Planning for complex scenarios, particularly those in which large teams of humans with distributed expertise and varying preferences share a set of resources, poses a number of challenges. While the team as a collective has full knowledge of the task requirements, the constraints, and the preferences of every individual and subteam, no single team member knows the full model of the task and preferences. Such a scenario could be an ideal context in which to leverage an automated planning agent. However, many complex domains involve context-dependent preferences and constraints that vary with each planning episode, so encoding a static model to represent the planning scenario is not possible.

[Smith, 2012] introduces the Mars Exploration Rover activity planning domain, in which a group of science and engineering subteams with potentially competing preferences work together to develop the rover’s tactical activity plan. Throughout the planning process, information is aggregated through complex coordination structures between subteams, which results in a time-consuming iterative planning process. [Smith, 2012] highlights the need to integrate automated planning into an iterative process that begins before goals, objectives, and preferences are fully defined, and outlines the technical implications for planning: the need to specify and use constraints naturally in the planning process, to generate multiple qualitatively different plans for analysis, and to provide explanations of planning decisions.

Given the technical implications laid out by [Smith, 2012], we see three key pieces to providing autonomous assistance through a mixed-initiative planning system in a complex domain. First, the preferences of individuals or subteams must be elicited for consideration in planning [Berry et al., 2011]. Second, a plan must be generated that takes into account both the hard constraints inherent to the problem and the soft constraints elicited as preferences [Gerevini and Long, 2006]. Finally, when constraints conflict or preferences differ, the reason for infeasibility must be communicated effectively to the humans in the loop so that they can replan more efficiently together with the autonomous system [Langley et al., 2017]. These three steps of preference elicitation, optimization, and explanation can be integrated into an iterative process by which teams converge on the ideal schedule.

Previous work by [Berry et al., 2011] develops a personalized time management agent that iteratively elicits user scheduling preferences, integrates them into meeting scheduling, and improves its user preference model through online learning. One limitation of this work is that preferences are learned over a fixed set of objective terms, which could limit how expressively true user preferences are captured. Further, feedback to the user takes the form of candidate schedules, with no information as to why certain preferences were not accommodated.

We envision using Linear Temporal Logic (LTL) as a common language that provides a natural link between the three components of the iterative planning problem, facilitating both elicitation of expressive preferences and intelligible explanations of the system’s decision-making processes [Kim et al., 2017]. The output of each of the preference elicitation, planning, and explanation pieces can be used as input to the next step in the process. For example, preferences elicited as LTL specifications can be passed directly to an automated planner as soft constraints, and the plan generated by the planner, in combination with the elicited preferences, can be used to generate relevant explanations.
The system can leverage learned preference information relevant to each team member when providing explanations, helping the team converge on a schedule much faster than such processes allow today. Further, since LTL is readily understandable by both automated planners and the humans interacting with the system, it can be used both to describe human preferences as constraints on the planning problem and to explain planner decisions concisely. In this thesis, we plan to explore the three individual components proposed and their integration into an iterative process.
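
To make the elicit, plan, and explain loop concrete, the following is a minimal sketch of one iteration, not the system proposed in this thesis. It rests on several assumptions that do not come from the text above: formulas are encoded as nested tuples and evaluated under finite-trace semantics, the planner is a brute-force search over activity orderings, and the activity names, preference owners, and weights are hypothetical.

```python
from itertools import permutations

# Hypothetical tuple encoding of LTL formulas, with finite-trace semantics:
#   ("ap", name)          name is active at the current step
#   ("not", f), ("implies", f, g)
#   ("next", f)           f holds at the next step (false at the last step)
#   ("always", f)         f holds at every remaining step
#   ("until", f, g)       g eventually holds, and f holds at every step before


def holds(formula, trace, i=0):
    """Evaluate a formula on a finite trace (list of sets) from position i."""
    op = formula[0]
    if op == "ap":
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "implies":
        return (not holds(formula[1], trace, i)) or holds(formula[2], trace, i)
    if op == "next":
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "always":
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "until":
        for j in range(i, len(trace)):
            if holds(formula[2], trace, j):
                return True
            if not holds(formula[1], trace, j):
                return False
        return False
    raise ValueError(f"unknown operator: {op}")


def plan(activities, hard, preferences):
    """Brute-force planner: one activity per step, maximize satisfied weight."""
    best = None
    for order in permutations(activities):
        trace = [{activity} for activity in order]
        if not all(holds(h, trace) for h in hard):
            continue  # hard constraints are never traded off
        unmet = [p for p in preferences if not holds(p["formula"], trace)]
        score = sum(p["weight"] for p in preferences if p not in unmet)
        if best is None or score > best["score"]:
            best = {"order": order, "score": score, "unmet": unmet}
    return best


def explain(result):
    """Report which elicited preferences the chosen plan left unsatisfied."""
    if result is None:
        return ["No ordering satisfies the hard constraints."]
    return [
        f"{p['owner']}: '{p['text']}' was not satisfied (weight {p['weight']})"
        for p in result["unmet"]
    ]


# Hypothetical planning episode with three rover activities.
activities = ["drive", "image", "uplink"]

# Hard constraint: the rover does not drive until the site has been imaged.
hard = [("until", ("not", ("ap", "drive")), ("ap", "image"))]

# Elicited preferences (soft constraints); owners and weights are invented.
preferences = [
    {"owner": "science", "text": "image first", "weight": 2,
     "formula": ("ap", "image")},
    {"owner": "engineering", "text": "drive first", "weight": 1,
     "formula": ("ap", "drive")},
    {"owner": "engineering", "text": "uplink right after driving", "weight": 1,
     "formula": ("always", ("implies", ("ap", "drive"),
                            ("next", ("ap", "uplink"))))},
]

result = plan(activities, hard, preferences)
print(result["order"])
for line in explain(result):
    print(line)
```

In this toy episode the engineering preference to drive first conflicts with the hard constraint, so the selected ordering leaves it unmet and the explanation names that preference and its owner. In a realistic setting the exhaustive search would be replaced by a planner that supports preferences, and the explanation step would need to identify why a preference could not be met rather than only which one.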