Study on Motion Forms of Mobile Robots Generated by Q-Learning Process Based on Reward Databases

This paper investigates the motion forms of robots generated by the Q-Learning algorithm during the learning process. Focusing on this process, we analyzed the manner in which a caterpillar robot, which performs looping motions using two actuators, acquires actions for advancing. By observing a series of processes, we confirmed that various motion forms appeared or disappeared as a result of their interactions with the learning process and approached an optimum motion form. In most algorithms, such motion forms cannot appear in the learning process because the framework is almost predetermined by the teacher data, and the cost functions for learning usually cannot be considered as a continuous process. The characteristics of reinforcement learning are very interesting from the viewpoint of biological evolution. This paper describes the effects of the interaction between the robot kinematics and the environment as a direct result of changing the environment. In addition, this study attempted the acquisition of two-dimensional motions with a starfish robot having four actuators. The result demonstrates that the robot can obtain a reasonable motion from the complicated relationships with the environment by skillfully employing its structure. Moreover, the investigations performed in this study imply that reward manipulation may give new insight into the learning process. This paper examines the possibility of using reward combinations to generate arbitrary motions.

Index Terms: Reinforcement learning, Q-Learning, motion form, mobile robot.