Q-Learning Algorithms: A Comprehensive Classification and Applications

Q-learning is arguably one of the most applied representative reinforcement learning approaches and one of the off-policy strategies. Since the emergence of Q-learning, many studies have described its uses in reinforcement learning and artificial intelligence problems. However, there is an information gap as to how these powerful algorithms can be leveraged and incorporated into general artificial intelligence workflow. Early Q-learning algorithms were unsatisfactory in several aspects and covered a narrow range of applications. It has also been observed that sometimes, this rather powerful algorithm learns unrealistically and overestimates the action values hence abating the overall performance. Recently with the general advances of machine learning, more variants of Q-learning like Deep Q-learning which combines basic Q learning with deep neural networks have been discovered and applied extensively. In this paper, we thoroughly explain how Q-learning evolved by unraveling the mathematical complexities behind it as well its flow from reinforcement learning family of algorithms. Improved variants are fully described, and we categorize Q-learning algorithms into single-agent and multi-agent approaches. Finally, we thoroughly investigate up-to-date research trends and key applications that leverage Q-learning algorithms.