Augmented Workspace for Human-in-the-Loop Plan Execution

Limitations of natural language present a significant barrier to adopting autonomous systems in safety-critical workflows involving humans and machines. We propose to build on recent advances in augmented reality (AR) technology to develop alternative modes of communication between humans and robots. To this end, we demonstrate how the Microsoft HoloLens (https://www.microsoft.com/microsoft-hololens/en-us) can be used to project a robot's goals and intentions to its human teammate, who can use these cues to engage in real-time collaborative plan execution with the robot. We hope that the proposed system will inspire research in augmenting human-robot interactions with such alternative forms of communication in the interest of safety, productivity and fluency of teaming, particularly in the manufacturing industry, where the use of such wearables can be enforced.

Effective planning for human-robot teams not only involves the capacity to be "human-aware" during the plan generation process, but also requires the ability to interact with the human during the plan execution phase. This is also emphasized in the Roadmap for U.S. Robotics report, which outlines that "humans must be able to read and recognize robot activities in order to interpret the robot's understanding". At the core of this problem is the impedance mismatch between humans and robots in how they communicate. Despite the progress made in natural language processing, natural language understanding remains a largely unsolved problem, and as such robots find it difficult to express their own goals and intentions effectively. In this demonstration, we show how this problem can be addressed using an alternative, holographic vocabulary for communication that allows for real-time collaborative plan monitoring and execution by a robot with a human in the loop.

The Augmented Workspace

We now demonstrate different ways in which augmented reality can improve the human-robot workspace. A video demonstrating these capabilities is available at https://goo.gl/bFkFDD.

Figure 1: An augmented workspace for human-in-the-loop operation of robots in industry.

Perhaps the biggest use of AR in the context of planning is for interactive plan execution. For example, a robot involved in an assembly task can project the objects it is going to manipulate into the human's point of view and annotate them with holograms that correspond to its intentions to use or pick them up. The human can, in turn, access or claim a particular object in the virtual space and force the robot to re-plan, without there ever being any conflict of intentions in the real space. The human can thus not only infer the robot's intent immediately from these holographic projections, but also interact with them to communicate their own intentions directly and thereby modify the robot's behavior online. Figure 2 shows, in detail, one such use case in our favorite BlocksWorld domain.

The human can exercise finer control over the robot by accessing the Holographic Control Panel, as seen in Figure 3(a). The panel provides controls to start and stop execution of the robot's plan, as well as fine-grained motion control of both the base and the arm by making them mimic the user's arm motion gestures.
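For concreteness, the following minimal sketch illustrates the claim-and-replan loop underlying this interaction in the BlocksWorld use case. The names and interfaces here (plan_tower, on_claim, publish, the marker strings) are illustrative placeholders rather than the system's actual API; in the deployed demo the corresponding annotations travel between the planner and the HoloLens client.

```python
"""Minimal sketch (assumed interfaces, not the demo's actual API) of the
claim-and-replan loop: the robot annotates blocks with its intentions,
and a human claim on a block triggers re-planning without it."""
from dataclasses import dataclass, field


@dataclass
class Workspace:
    blocks: list                                # blocks available on the table
    claimed: set = field(default_factory=set)   # blocks claimed by the human


def plan_tower(ws: Workspace, height: int = 3):
    """Return an ordered list of blocks to stack, skipping claimed ones."""
    usable = [b for b in ws.blocks if b not in ws.claimed]
    if len(usable) < height:
        return None                             # not enough free blocks left
    return usable[:height]


def publish(annotation: dict):
    """Stand-in for sending a hologram annotation to the AR headset."""
    print("hologram ->", annotation)


def annotate(plan):
    """Project intentions: arrow on the next pickup, cross on later blocks."""
    publish({"block": plan[0], "marker": "pickup-arrow"})
    for b in plan[1:]:
        publish({"block": b, "marker": "reserved-cross"})


def on_claim(ws: Workspace, block: str):
    """Human pinches a block in the virtual space: mark it claimed and re-plan."""
    ws.claimed.add(block)
    publish({"block": block, "marker": "faded"})    # render block as unavailable
    new_plan = plan_tower(ws)
    if new_plan:
        annotate(new_plan)
    return new_plan


if __name__ == "__main__":
    ws = Workspace(blocks=["blue", "red", "green", "orange"])
    annotate(plan_tower(ws))    # initial intentions over blue, red, green
    on_claim(ws, "green")       # human claims green; intentions shift to orange
```

In this view, claiming a block simply removes it from the planner's pool of usable resources, which is what allows conflicts to be resolved in the virtual space before they can arise in the real one.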
The use of AR is, of course, not restricted to the procedural execution of plans. It can also be used to annotate the workspace with artifacts derived from the plan under execution in order to improve the fluency of collaboration. For example, Figure 3(b-e) shows the robot projecting its area of influence in its workspace, which can be very useful in demarcating safe zones around the robot. As seen in Figure 3(f-i), the robot can also render hidden objects or partially observable state variables relevant to a plan, as well as indicators that improve the human's peripheral awareness with respect to the robot.
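As a rough illustration of how such safety cues could be derived, the sketch below approximates the robot's volume of influence as a sphere whose radius is the arm's maximum reach, and the area of influence as that sphere's projection onto the floor. Both the helper names and the sphere/disc approximation are assumptions made for illustration only; the actual demo renders these regions dynamically from the robot's state, as shown in Figure 3(b-e).

```python
"""Illustrative sketch (not the demo's implementation) of safety-cue geometry."""


def reach_radius(link_lengths):
    """Conservative reach of the arm: all links fully extended."""
    return sum(link_lengths)


def volume_of_influence(shoulder_xyz, link_lengths):
    """Sphere (center, radius) enclosing every point the arm could reach."""
    return {"center": shoulder_xyz, "radius": reach_radius(link_lengths)}


def area_of_influence(shoulder_xyz, link_lengths):
    """Projection of that sphere onto the floor plane: a disc around the base."""
    x, y, _z = shoulder_xyz
    return {"center": (x, y), "radius": reach_radius(link_lengths)}


if __name__ == "__main__":
    # Example: a three-link arm mounted 0.9 m above the floor.
    links = [0.35, 0.30, 0.20]
    print(volume_of_influence((0.0, 0.0, 0.9), links))
    print(area_of_influence((0.0, 0.0, 0.9), links))
```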
Figure 2: Interactive execution of a plan in the BlocksWorld domain. (a) First-person view of the real workspace showing the initial state; the robot wants to build a tower of height three with the blue, red and green blocks. (b) Blocks are annotated with intuitive holograms, e.g. an upward arrow on the block the robot is going to pick up next and a red cross mark on the ones it is planning to use later; the human can also gaze at an object for more information (shown in the rendered text). (c) & (d) The human pinches the green block and claims it; the robot now projects a faded-out green block and re-plans to use the orange block instead (the pickup arrow shifts to the latter at this point). (e) Real-time update and rendering of the current state, showing the status of the plan and the objects in the environment. (f) The robot completes its new plan using the orange block.

Table 1: Relative merits of augmented reality (AR) and existing mixed reality (MR) approaches for intention projection.
- Interaction (AR ✓, MR ✗): One of the key features of AR is that it gives the human the ability to interact directly with the holograms. This becomes particularly difficult in MR, especially due to difficulties in accurate gaze and gesture estimation.
- Occlusion (AR ?, MR ✗): Unlike MR, AR is not particularly disadvantaged by occlusions due to objects or agents in the workspace. However, the HoloLens in particular does reduce the field of view significantly.
- Ergonomics (AR ✗, MR ✓): At present, the size, weight and occlusion of the peripheral view due to the HoloLens make it somewhat unsuitable for longer operations, whereas the MR approach does not require any wearables and leaves the human mostly uninhibited. This is expected to improve in later iterations of the HoloLens, or with devices custom-made and optimized for a setting such as this.
- Scalability (AR ✓, MR ?): MR will find it difficult to scale beyond peer-to-peer interactions or a confined space, given the requirement of viable projectors for every interaction. This is hardly an issue for the HoloLens, which provides unrestricted mobility and portability of solutions.
- Scope (AR ✓, MR ✗): MR is limited to a 2D canvas (the environment), whereas AR can not only provide 3D projections that can be interacted with, but can also express information that 2D projections cannot, e.g. a 3D volume of safety around the robot rather than just the projected area on the floor.

Figure 3: Interactive plan execution using the (a) Holographic Control Panel; safety cues showing dynamic real-time rendering of the volume of influence (b)-(c) or the area of influence (d)-(e), as well as (i) indicators for peripheral awareness; and interactive rendering of hidden objects (f)-(h) to improve observability and situational awareness in complex workspaces.

More generally, this work considers other approaches to bridging the impedance mismatch between the robot and the human through the use of wearables, e.g. EEG headsets. More details can be found at https://arxiv.org/abs/1703.08930.

Acknowledgements. This work is supported in part by the NASA grant NNX17AD06G and ONR grants N000141612892, N00014-13-1-0176, N00014-13-1-0519 and N00014-15-12027. The first author is also supported by the IBM Ph.D. Fellowship.