Learning rest-to-rest motor coordination in articulated mobile robots

In this dissertation we analyze common approaches for the synthesis of motion behaviors in robots composed of joints. Current control frameworks are robust, complete, and include advanced features that compute efficient control actions, commanding the robot through desired workspace references while dealing with the complexity of the mechanism (redundancies, non-linearities, etc.). Nevertheless, from the point of view of the human interpretation of motion, these frameworks fall short both in the range of motions they can realize and in the naturalness of those motions. The main objective of this work is to build a framework for the synthesis of motion behaviors defined by rest-to-rest movements in this type of robot. As in traditional methods, initial kinematic information about the motion is assumed to be available; however, trajectory references in the workspace are not specified, leaving this level of control unconstrained. The presented framework considers the robot as a set of actuated and unactuated Degrees Of Freedom (DOFs). The former are directly driven by torque commands, while the final states reached by the unactuated DOFs are analyzed as a consequence of the joint acceleration profiles. Thus, the problem is formulated as the computation of acceleration policies for the joints such that the desired final values of the unactuated DOFs are attained. Two main ideas shape the perspective from which the problem is tackled. First, Dynamical Systems (DSs) are used as policies for the actuated DOFs; second, the indirect manipulation of the unactuated DOFs is understood as a coordination phenomenon. Using DSs as acceleration policies gives the behavior of the joints convenient attractor properties. Moreover, the resulting behavior of the unactuated DOFs, e.g. the robot's dynamic balance, becomes a direct consequence of the temporal properties of these DSs.
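To make the idea of a DS used as an acceleration policy concrete, the following is a minimal sketch for a single joint. It assumes a linear, critically damped second-order attractor; the abstract does not state the form of the DS used in the dissertation, so the function names, the gains `alpha` and `beta`, and the integration scheme are all illustrative assumptions.

```python
def attractor_acceleration(q, qd, goal, alpha=25.0, beta=6.25):
    """Hypothetical DS policy: qdd = alpha * (beta * (goal - q) - qd).

    With beta = alpha / 4 the system is critically damped, so the joint
    approaches `goal` without overshoot and comes to rest there.
    """
    return alpha * (beta * (goal - q) - qd)


def rollout(q0, goal, duration=2.0, dt=0.001):
    """Integrate the policy from rest at q0; return the final (q, qd)."""
    q, qd = q0, 0.0
    for _ in range(int(round(duration / dt))):
        qdd = attractor_acceleration(q, qd, goal)
        qd += qdd * dt  # semi-implicit Euler: update velocity first
        q += qd * dt    # then position, using the new velocity
    return q, qd


if __name__ == "__main__":
    final_q, final_qd = rollout(q0=0.0, goal=1.0)
    print(final_q, final_qd)
```

Whatever the starting position, the rollout ends at rest at the goal, which is the attractor property the text refers to. Changing the gains changes how quickly the joint gets there, and therefore the acceleration profile that the unactuated DOFs experience.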
In order to obtain the desired whole-body behavior, the initial joint trajectories are deformed such that their convergence properties remain valid but the resulting transition motion changes, favoring the behavior of the unactuated DOFs. The deformation of a joint's profile is attained by imposing a linear combination of the initial local primitives as the acceleration policy. The indirect control of the unactuated DOFs, through the weighted sum of the primitives, is interpreted as motor coordination. Finally, a policy gradient reinforcement learning (PGRL) algorithm, adapted to the robotic scenario, is used as the synthesis methodology. The performance of every combination of primitives is quantified, so that the coordination weights can be optimized through an iterative gradient descent process.
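The weighted combination of primitives and the gradient-based tuning of the coordination weights can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration: the two linear primitives, the cosine coupling that stands in for an unactuated DOF, the target value, and the weight bounds. A deterministic finite-difference gradient with step halving is used as a simple stand-in for the PGRL algorithm, whose details the abstract does not give.

```python
import math

# Two hypothetical local primitives for one joint, each a linear attractor
# qdd = k * (goal - q) - c * qd, stored as (stiffness k, damping c).
PRIMITIVES = [(4.0, 4.0), (100.0, 20.0)]
W_MIN, W_MAX = 0.05, 3.0  # assumed bounds; positive weights keep the goal attractive


def simulate(weights, q0=0.0, goal=1.0, horizon=1.5, dt=0.001, coupling=0.5):
    """Roll out the weighted-sum policy; return the final value of a toy unactuated DOF."""
    q, qd, x, xd = q0, 0.0, 0.0, 0.0
    for _ in range(int(round(horizon / dt))):
        # Acceleration policy: linear combination of the primitives.
        qdd = sum(w * (k * (goal - q) - c * qd)
                  for w, (k, c) in zip(weights, PRIMITIVES))
        # Toy coupling (assumption): the unactuated DOF reacts to the joint acceleration.
        xdd = -coupling * math.cos(q) * qdd
        qd += qdd * dt
        q += qd * dt
        xd += xdd * dt
        x += xd * dt
    return x


def cost(weights, x_desired=-0.45):
    """Squared error between the final unactuated value and an assumed target."""
    return (simulate(weights) - x_desired) ** 2


def finite_difference_gradient(weights, eps=1e-3):
    """Central-difference estimate of the cost gradient with respect to the weights."""
    grad = []
    for i in range(len(weights)):
        up, down = list(weights), list(weights)
        up[i] += eps
        down[i] -= eps
        grad.append((cost(up) - cost(down)) / (2.0 * eps))
    return grad


def optimize(weights, iterations=20, step=0.5):
    """Gradient descent on the weights; a step is kept only if it lowers the cost."""
    best = cost(weights)
    for _ in range(iterations):
        grad = finite_difference_gradient(weights)
        candidate = [min(W_MAX, max(W_MIN, w - step * g))
                     for w, g in zip(weights, grad)]
        candidate_cost = cost(candidate)
        if candidate_cost < best:
            weights, best = candidate, candidate_cost
        else:
            step *= 0.5  # shrink the step when no improvement is found
    return weights, best


if __name__ == "__main__":
    print(optimize([0.5, 0.5]))
```

Because both primitives share the same goal and the weights stay positive, any combination still converges to that goal. Only the transition changes, and with it the final value of the toy unactuated quantity that the cost evaluates.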