Theorization, implementation, system architecture, and analysis of fast reinforcement learning techniques, with application to autonomous agents
Reinforcement learning is a major area of machine learning. It is a computational approach to learning in which an agent explores a complex and uncertain environment, perceives its current state, and takes actions that eventually lead to a specific goal. The environment, in return, provides a reward that reflects how well each action contributes to finding the optimal path to the goal. Reinforcement learning algorithms attempt to find a policy that maximizes the agent's cumulative reward over the course of the learning process. The faster the cumulative reward reaches its maximum, the faster the agent learns the optimal path to the goal.
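As a minimal illustration of the cumulative reward mentioned above, the following Python sketch sums the rewards collected along one episode. The function name `cumulative_reward`, the optional `discount` factor, and the sample reward values are assumptions introduced for illustration; they are not taken from the dissertation.

```python
def cumulative_reward(rewards, discount=1.0):
    """Sum the rewards of one episode, optionally discounting later steps.

    `discount` is a hypothetical parameter: 1.0 gives the plain sum,
    values below 1.0 weight early rewards more heavily.
    """
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (discount ** step) * reward
    return total


# Hypothetical episode: three steps that each cost 1, then the goal pays 10.
episode = [-1.0, -1.0, -1.0, 10.0]
undiscounted = cumulative_reward(episode)
discounted = cumulative_reward(episode, discount=0.5)
```

An agent that reaches the goal in fewer steps pays the per-step cost fewer times, so a shorter path yields a larger cumulative reward under this kind of reward signal.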
This dissertation addresses reinforcement learning and presents new approaches that contribute to the field of reinforcement learning in particular and to machine learning, artificial intelligence, and robotics in general. It presents and demonstrates robust techniques for fast reinforcement learning, with application to autonomous agents.
The first proposed reinforcement learning technique, which we introduce and call the Multiple-Lookahead-Levels technique, grants the agent multiple lookahead levels of visibility into its environment, which accelerates its learning process. The second proposed technique, called the Distance-Only based technique, rewards the agent based on its Euclidean distance to the goal. The third proposed technique, called the Distance-and-Frequency based technique, rewards the agent using its state occurrence frequency in addition to its Euclidean distance to the goal.
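The two distance-based reward ideas can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not give the actual reward formulas, so the negative-distance reward, the linear visit-count penalty, the `frequency_weight` parameter, and all function and class names here are hypothetical choices, not the dissertation's definitions.

```python
import math
from collections import Counter


def distance_only_reward(state, goal):
    """Assumed form: the closer the state is to the goal, the higher the reward."""
    return -math.dist(state, goal)


class DistanceAndFrequencyReward:
    """Assumed form: distance-based reward minus a penalty for revisiting a state."""

    def __init__(self, goal, frequency_weight=0.1):
        self.goal = goal
        self.frequency_weight = frequency_weight  # hypothetical tuning parameter
        self.visits = Counter()  # how many times each state has occurred

    def __call__(self, state):
        previous_visits = self.visits[state]
        self.visits[state] += 1
        distance_term = -math.dist(state, self.goal)
        return distance_term - self.frequency_weight * previous_visits


# Usage on a hypothetical 2-D grid with the goal at (3, 4).
reward_fn = DistanceAndFrequencyReward(goal=(3, 4), frequency_weight=0.5)
first_visit = reward_fn((0, 0))
second_visit = reward_fn((0, 0))
```

In this sketch a repeated state earns a lower reward than a first visit at the same distance, which discourages the agent from cycling through states it has already seen. Whether the dissertation combines distance and frequency in this way is not stated in the abstract.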
The work in this dissertation demonstrates the potential of the proposed techniques and proves that they lead to fast reinforcement learning. It includes the theory behind the proposed techniques, implementations of these techniques, a software-reconfigurable hardware design architecture for the Multiple-Lookahead-Levels technique, an illustration of how the Distance-and-Frequency technique can be used in multi-agent environments, and an analysis of the techniques' performance and effectiveness. The different stages of this work are supported by facts, equations, theorems, propositions, lemmas and their proofs of correctness, algorithms, experimental results, and analysis.