Multi Criteria Reinforcement Learning Based on Goal-directed Exploration and its Application to Bipedal Walking Robot