Distributed Deep Reinforcement Learning using TensorFlow

Deep Reinforcement Learning is the combination of Reinforcement Learning algorithms with Deep neural network, which had recent success in learning complicated unknown environments. The trained model is a Convolutional Neural Network trained using Q-Learning Loss value. The agent takes in observation, i.e. raw pixel image and reward from the environment for each step as input. The deep Q-learning algorithm gives out the optimal action for every observation and reward pair. The hyperparameters of Deep Q-Network remain unchanged for any environment. TensorFIow, an open source machine learning and numerical computation library is used to implement the deep Q-Learning algorithm on GPU. The distributed TensorFIow architecture is used to maximize the hardware resource utilization and reduce the training time. The usage of Graphics Processing Unit (GPU) in the distributed environment accelerated the training of deep Q-network. On implementing the deep Q-learning algorithm for many environments from OpenAI Gym, the agent outperforms a decent human reference player with few days of training.