Towards Reinforcement Learning of Haptic Search in 3D Environment

Because of its high computational cost, reinforcement learning in robotic applications is often based on non-sparse reward functions. We believe that, to enable a multi-fingered anthropomorphic robot to autonomously perform haptic search in a 3D environment, a reward function that enforces accurate classification between a target and a distractor is essential. To this end, we investigate the performance of a target-distractor classifier that employs Generalized Matrix Learning Vector Quantization (GMLVQ). This method is particularly suitable for the task because it is designed to represent one class with several prototypes. This matches one of our main requirements: representing the haptic exploration of a range of geometric features under a single label. Besides offering a suitable approach to classification, GMLVQ reveals the relevance of each input dimension and can be used to visualize data in a low-dimensional space.
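As a minimal sketch of the GMLVQ decision rule referred to above, the following Python snippet classifies a sample by its nearest prototype under the adaptive distance d(x, w) = (x - w)^T Omega^T Omega (x - w), with two prototypes sharing the "target" label. The prototypes, labels, matrix Omega, and function names are hypothetical values chosen for illustration, not the features or parameters used in this work, and no training step is shown.

```python
import numpy as np

def gmlvq_distances(x, prototypes, omega):
    """Squared adaptive distances d(x, w) = (x - w)^T Omega^T Omega (x - w)."""
    # Project each difference vector with Omega, then take squared norms.
    diff = (x - prototypes) @ omega.T
    return np.sum(diff ** 2, axis=1)

def classify(x, prototypes, labels, omega):
    """Return the label of the prototype closest to x under the adaptive distance."""
    return labels[np.argmin(gmlvq_distances(x, prototypes, omega))]

def relevances(omega):
    """Diagonal of Lambda = Omega^T Omega: the relevance of each input dimension."""
    return np.diag(omega.T @ omega)

# Hypothetical toy setup: two prototypes represent the single class "target"
# (e.g. two different geometric features), one prototype represents "distractor".
prototypes = np.array([[0.0, 0.0, 0.0],
                       [4.0, 0.0, 0.0],
                       [2.0, 3.0, 0.0]])
labels = np.array(["target", "target", "distractor"])

# Hypothetical 2x3 Omega: it ignores the third input dimension and doubles as
# a projection into a 2D space for visualization.
omega = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])

x = np.array([3.5, 0.5, 10.0])
predicted = classify(x, prototypes, labels, omega)
dimension_relevance = relevances(omega)
projected = omega @ x  # low-dimensional coordinates of x
```

In this toy configuration the third input dimension has zero relevance, so the large value of `x` along it does not influence the decision. In a trained model both the prototypes and Omega would be learned from labeled data.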